Abstract

We prove a near optimal round-communication tradeoff for the two-party quantum communication
complexity of disjointness. For protocols with $r$ rounds, we prove a lower bound of
$\tilde{\Omega}(n/r + r)$ on the communication required for computing disjointness of input size $n$, which
is optimal up to logarithmic factors. The previous best lower bound was $\Omega(n/r^2 + r)$ due to
Jain, Radhakrishnan and Sen [JRS03]. Along the way, we develop several tools for quantum
information complexity, one of which is a lower bound for quantum information complexity in
terms of the generalized discrepancy method. As a corollary, we get that the quantum
communication complexity of any boolean function $f$ is at most $2^{O(QIC(f))}$, where $QIC(f)$ is the
prior-free quantum information complexity of $f$ (with error 1/3).

1 Introduction


We prove near-optimal bounds on the bounded-round quantum communication complexity of
disjointness. Quantum communication complexity, introduced by Yao [Yao93], studies the amount of
quantum communication that two parties, Alice and Bob, need to exchange in order to compute a
function (usually boolean) of their private inputs. It is the natural quantum extension of classical
communication complexity [Yao79]. While the inputs are classical and the end result is classical,
the players are allowed to use quantum resources while communicating. The motivation for the
introduction of quantum communication was to study questions in quantum computation. For
example, in [Yao93], Yao used it to prove that the majority function does not have any linear size
quantum formulas.

While quantum communication (with entanglement) offers only a factor of 2 savings when
transmitting $n$ bits of classical information [Hol73, BW92, CvDNT98], it can still offer
super-constant savings (and sometimes exponential) in communication if the goal is just to compute a
boolean function of the inputs. For total boolean functions, the best-known separation between
classical and quantum communication is quadratic, for the disjointness function
[KS92, Raz92, Gro96, BCW98, AA03]. It is, in fact, a major open problem whether classical and quantum
communication are polynomially related for all total boolean functions. For partial functions,
exponential separations are known even between one-way quantum communication and arbitrary
classical communication [Raz99, KR11].

For disjointness with input size $n$, Grover's search [Gro96, BBHT98] can be used to obtain
a quantum communication protocol (with probability of error 1/3) with communication cost
$O(\sqrt{n}\log n)$ [BCW98]. The bound was later improved to $O(\sqrt{n})$ in [AA03]. The protocols attaining
this upper bound are very interactive and require $\Theta(\sqrt{n})$ rounds of interaction. The $O(\sqrt{n})$ upper
bound on the quantum communication complexity of disjointness has been shown to be tight in
[Raz02].

If we restrict the players to only $r$ rounds of interaction, then it is not hard to use the
$O(\sqrt{n})$ protocol discussed above as a black box to obtain an $O(n/r)$ communication protocol for
$n \ge r^2$. The best known lower bound was $\Omega(n/r^2)$ [JRS03]. We prove a lower bound of $\tilde{\Omega}(n/r)$,
which is optimal up to logarithmic factors:

Theorem A. (Theorem 7.3, rephrased) The $r$-round quantum communication complexity of $\mathrm{DISJ}_n$
is $\Omega\left(\frac{n}{r \log^{8}(r)}\right)$.

The analogous result for query complexity of quantum search, an $\Omega(n/r)$ lower bound for
the number of queries when $r$ sets of nonadaptive queries are allowed, was known before [Zal99].
Our lower bound does not give a new proof of the $\Omega(\sqrt{n})$ bound on the quantum communication
complexity of disjointness [Raz02], since our proof uses that lower bound (in fact we use something
much stronger, a strengthening of the strong direct product theorem for disjointness [KSDW04]
due to [She12]).
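
The shape of the round-communication tradeoff in Theorem A can be made concrete with a small numeric sketch. It evaluates the expression $n/r + r$ from the abstract while ignoring constants and logarithmic factors, which is a simplifying assumption; the function names `tradeoff` and `best_round_count` are illustrative and do not come from the paper.

```python
import math


def tradeoff(n: int, r: int) -> float:
    """Shape of the bound n/r + r, ignoring constants and log factors."""
    return n / r + r


def best_round_count(n: int) -> int:
    """Round count r in [1, n] that minimizes n/r + r (brute force)."""
    return min(range(1, n + 1), key=lambda r: tradeoff(n, r))


n = 10_000
r_star = best_round_count(n)
# The minimizer sits at r = sqrt(n), where the expression equals 2*sqrt(n).
print(r_star, tradeoff(n, r_star), 2 * math.sqrt(n))
```

Under these simplifications the expression is smallest near $r = \sqrt{n}$, which is consistent with the highly interactive $O(\sqrt{n})$ protocol; with fewer rounds the $n/r$ term dominates.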

There is a rich history of papers studying lower bounds on bounded-round communication
complexity, for example for the pointer jumping problem [NW93, PRV01, Kla98, KNTSZ01], for
sparse set disjointness [ST13], for equality [BCK14] and several other examples. Most of these
lower bounds are proven via a round elimination strategy: show that an $r$-round protocol can be
converted into an $(r-1)$-round protocol without too much increase in communication cost and
error; arrive at a contradiction by obtaining a too-good-to-be-true 1-round or 0-round protocol. Even
the result of [JRS03] can be viewed as round elimination on quantum information complexity of the
2-bit AND. Despite substantial effort, obtaining the optimal $\tilde{\Omega}(1/r)$ lower bound on the $r$-round
quantum information complexity of AND via round elimination has remained elusive. We prove:

Theorem B. (Corollary 7.2, rephrased) The $r$-round quantum information complexity of AND
with prior $1/3, 1/3, 1/3, 0$ is $\Omega\left(\frac{1}{r \log^{8}(r)}\right)$.

As discussed below, we obtain this result by using existing lower bounds for the communication
complexity of quantum disjointness. A direct proof of a quantum information complexity lower
bound for the 2-bit AND remains an intriguing open problem. In light of the fact that disjointness
has a sub-linear quantum communication complexity, it is not surprising that the quantum
information complexity of AND vanishes with the number of rounds. This phenomenon is closely
related to the Elitzur-Vaidman bomb tester [EV93, KWHZ95], which gives a sequence of quantum
measurements that allows one to test whether a bomb is loaded without detonating it. The loss
of the protocol (i.e. the probability that the bomb will explode, which loosely corresponds to
the amount of information revealed about the bomb) behaves like $1/r$, where $r$ is the number of
measurements performed.
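
The $1/r$ behaviour of the bomb tester can be checked numerically. The sketch below assumes the usual repeated-interrogation scheme, in which the probe is rotated by an angle $\pi/(2r)$ in each of $r$ cycles and a live bomb measures it after every rotation; this specific parametrization is an assumption made here for illustration and is not spelled out in the text. The function name `explosion_probability` is hypothetical.

```python
import math


def explosion_probability(r: int) -> float:
    """Chance that a live bomb detonates during r interrogation cycles.

    Assumed scheme: each cycle rotates the probe by pi/(2r); a live bomb
    then absorbs it with probability sin^2(pi/(2r)) and otherwise resets it.
    """
    survive_one_cycle = math.cos(math.pi / (2 * r)) ** 2
    return 1.0 - survive_one_cycle ** r


for r in (10, 100, 1000):
    p = explosion_probability(r)
    # r * p approaches the constant pi^2 / 4, i.e. p behaves like 1/r.
    print(r, p, r * p)
```

In this model the product $r \cdot p$ tends to $\pi^2/4$, so the loss decays proportionally to $1/r$ as the number of measurements grows.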

Our proof relies on the notion of quantum information complexity, defined recently in [Tou15],
where it is used to prove a direct sum theorem for constant round quantum communication.
Quantum information is harder to manipulate than classical information, and tools that are standard in
the classical setting are yet to be developed for the quantum case. However, quantum information
complexity could still be useful in proving partial direct sum and direct product theorems of the
kind known in the classical world [BBCR10, BRWY13]. Moreover, a model similar to that of quantum
communication complexity is connected to proving SDP extension complexity lower bounds [JSWZ13].
Although the recent breakthrough for SDP lower bounds [LRS15] does not follow this direction, it is
likely that a quantum information complexity viewpoint will provide further insights, as information
complexity has provided in the classical case (LP extension complexity) [BM13, BP13]. Further
development of tools for quantum communication and information complexity is likely to further the
SDP extension complexity program.

We also prove that for all boolean functions, prior-free quantum information complexity is lower
bounded by the generalized discrepancy method:

Theorem C. (Theorem 5.7, rephrased) For any boolean function $f$ and a sufficiently small constant
error $\eta > 0$, the prior-free quantum information complexity of $f$ with error $\eta$ is lower bounded by
the generalized discrepancy bound for $f$.


Previously no lower bounds were known on the quantum information complexity of general
boolean functions. Our proof relies on the strong direct product theorem for quantum
communication complexity in terms of the generalized discrepancy method [She12]. Note that in the classical
setting such a result can be proven directly using zero-communication protocols [KLL+12]. It
remains to be seen whether such a direct proof can be obtained in the quantum setting.

As a corollary we also get that the quantum communication complexity of any boolean function 
is at most exponential in the prior-free quantum information complexity. 


Theorem D. (Corollary 5.8, rephrased) For any boolean function $f$, the quantum communication
complexity of $f$ with error 1/3 is at most $2^{O(QIC(f, 1/3))}$, where $QIC(f, 1/3)$ is the prior-free
quantum information complexity of $f$ with error 1/3.

Note that the classical analogue of this is proven via a compression argument [Bra12], but we
prove this via an indirect argument. It would be interesting to prove this directly via a quantum
compression argument.

2 Proof overview and discussion

High-level strategy. At a high level, the proof builds on the connection between quantum
information complexity and quantum communication complexity of the disjointness function $\mathrm{DISJ}_m$
with various values of $m$. There are two parts to the proof:

1. Suppose there is an $r$-round quantum protocol for disjointness of input size $n \ge r^2$ with
communication cost $\le \frac{n}{r\,\mathrm{polylog}(r)}$. Then there exists a protocol for disjointness of input size $r^2$
with quantum information cost $\le o(r)$.

2. Lower bound on quantum information complexity of disjointness: we prove that the
(prior-free) quantum information complexity of any boolean function is lower bounded by the
generalized discrepancy method, which by results in [She07] implies that quantum information
complexity of disjointness with input size $r^2$ is $\Omega(r)$.

Note that these two steps imply a lower bound on the bounded round quantum communication 
complexity of disjointness. Also the above statements are about computation with some constant 
error (say 1/3). 

Both directions are proven via a connection between the information complexity of a problem
and its communication complexity. In one direction, a protocol for a large sized disjointness can
be converted into a low-information protocol for a smaller size disjointness. Using the converse
direction of the connection, a low-information protocol for $\mathrm{DISJ}_{r^2}$ leads to a protocol for many
copies of the problem that violates known direct product results. The former connection has been at
the heart of many classical lower bounds involving information complexity [BYJKS04, BGPW13a].
The latter connection (deriving information complexity lower bound from known communication
lower bound on an "amortized" version of the problem) has been previously explored in the classical
setting by [BGPW13b].

Let us start by giving a high level overview of the first step. If there is an $r$-round quantum
protocol for disjointness of input size $n$ with communication cost $\frac{n}{r\,\mathrm{polylog}(r)}$ and 1/3 probability
of error, then by a direct sum argument in [Tou15], there exists an $r$-round quantum protocol $\Pi$
for AND with 1/3 probability of error (for a worst case input) and quantum information cost
$\le \frac{1}{r\,\mathrm{polylog}(r)}$ w.r.t. any distribution $\mu$ s.t. $\mu(1,1) = 0$. Now we want to use $\Pi$ to obtain a low
information protocol for disjointness of size $r^2$. One can imagine that if we run $\Pi$ on each coordinate
of the disjointness instance, we get an $r$-round protocol $\tau$ of information cost $\le \frac{r}{\mathrm{polylog}(r)}$ that also
solves disjointness with small error (assuming we first reduce the error of $\Pi$ to $1/r^3$, losing a
log factor in information cost). However, the issue is that the information cost of $\tau$ is low only w.r.t.
distributions $\nu$ supported on disjoint pairs of sets. The information cost of $\tau$ may increase
dramatically when it is run on a pair of sets with many intersections. To deal with this we use a
trick used in [BGPW13a].

Note that if there are too many intersections in a disjointness instance, then the players can
just subsample some of the coordinates and check for an intersection in those coordinates. Hence
we can assume wlog that the intersection size in a typical input distributed according to $\nu$ is small.
This means that if we look at a typical coordinate $i$, the marginal distribution $\nu_i$ has small mass
on $(1,1)$. And in this case, we can run $\Pi$ on each coordinate. The only thing left to understand is:
how does the information cost of $\Pi$ change if we place a small mass, say $w$, on $(1,1)$? The answer
to this turns out to be $r \cdot H(w)$, where $\Pi$ has $r$ rounds. Note that this is in contrast to the classical
case, where the answer would be just $H(w)$. Later we will give an example of a quantum protocol
for AND whose information cost does go up by $r \cdot H(w)$. Also this is the only place where we
use the fact that the protocol we started with had only $r$ rounds. Such a dependence is
necessary here, since an $\Omega(n/r)$ lower bound for general (non-$r$-round) protocols would violate the
$O(\sqrt{n})$ upper bound.
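
The gap between the classical penalty $H(w)$ and the quantum penalty $r \cdot H(w)$ can be tabulated with a short sketch. It assumes only the standard binary entropy function; the helper names are hypothetical, and the scaling $r \cdot H(w)$ is the one quoted in the paragraph above, not something derived by the code.

```python
import math


def binary_entropy(w: float) -> float:
    """Binary entropy H(w) in bits, with the convention 0 log 0 = 0."""
    if w <= 0.0 or w >= 1.0:
        return 0.0
    return -w * math.log2(w) - (1 - w) * math.log2(1 - w)


def classical_penalty(w: float) -> float:
    """Increase in information cost quoted for the classical case."""
    return binary_entropy(w)


def quantum_penalty(r: int, w: float) -> float:
    """Increase quoted for an r-round quantum protocol: r * H(w)."""
    return r * binary_entropy(w)


w = 1e-4
for r in (1, 10, 100):
    print(r, classical_penalty(w), quantum_penalty(r, w))
```

For the quantum penalty to stay small, the mass $w$ on $(1,1)$ has to shrink as the number of rounds grows, which is why the subsampling step that keeps intersections rare matters in the argument.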

For the second step, we use compression along with a strong direct product theorem for quantum
communication complexity of $f$ in terms of the generalized discrepancy lower bound $\mathrm{GDM}_{1/5}(f)$
due to Sherstov [She12]. It says that to compute $k$ copies of a boolean function $f$ with success
probability it requires at least $k \cdot \mathrm{GDM}_{1/5}(f)$ qubits of communication (with arbitrary
amount of entanglement). Note that a strong direct product theorem for quantum communication
complexity of disjointness was already known [KSDW04], but we need a stronger version for our
proof which shows that even computing a large fraction of the copies is hard, and Sherstov's result
also holds in this case.¹

Suppose there is a protocol $\pi$ for a function $f$ with quantum information cost $\le I$ w.r.t. a
distribution $\mu$, and probability of error $\le \epsilon$. Then by quantum information equals amortized
communication [Tou15], we get a protocol $\pi_k$ for $f^k$ which computes at least $(1 - 2\epsilon)k$ coordinates
correctly with probability $\ge 0.99$ (w.r.t. $\mu^k$) and $QCC(\pi_k) \le k \cdot I + o(k)$. To apply Sherstov's
theorem, we need such a protocol which works for worst case inputs. We show how to obtain such
a worst case to average case reduction, whence applying Sherstov's result gives us the lower bound
on information complexity.

Discussion and open problems 

In its entirety our proof shows how from an $r$-round protocol for disjointness, one can obtain a
protocol for $k$ copies of disjointness of size $r^2$. But to achieve this reduction, we have to move to
information complexity, since the number of rounds $r$ only comes up in an information theoretic
context in our proof.

Thus the reduction structure of the proof is communication → information → communication,
with the latter communication problem having a known lower bound. Lower bounds for disjointness
in the classical setting [BYJKS04, BGPW13a] only do a reduction of the form communication →
information, with an information complexity lower bound on the resulting problem proven directly.

Open Problem 2.1. Give a direct proof of a lower bound for the information complexity of
$\mathrm{DISJ}_{r^2}$.

One possible attack route would be along the lines of the proof for the classical case using
zero-communication protocols [KLL+12]. In the past, techniques developed for two-party quantum
communication, e.g. the pattern matrix method [She07], turned out to be useful for multiparty
number-on-forehead communication [CA08, She14]. It could be that techniques developed for
quantum information also result in similar progress.

¹ We could probably base our result off the lower bound of [KSDW04], but the reduction would be
considerably more complicated.

Another natural question is whether the lower bound on the information complexity of AND 
can be proved using a direct argument: 

Open Problem 2.2. Give a direct proof of Theorem B. 

Even though efforts since [JRS03] to date have been unsuccessful, it still could be possible to
directly obtain Theorem B via round elimination or other techniques, and that would be really
interesting, since it would also yield a new proof of the lower bound for quantum communication
complexity of disjointness [Raz02, She07]. The recent breakthrough results in lower bounding
conditional quantum mutual information [FR14, BHOS14, BT15] should be relevant.

Remark 2.3. Our proofs can be adapted to show that the (unbounded round) zero-error quantum
information complexity of AND w.r.t. the prior $(1-\epsilon)/3, (1-\epsilon)/3, (1-\epsilon)/3, \epsilon$ is $\tilde{\Omega}(\sqrt{\epsilon})$. It is another
intriguing question whether it is possible to have a direct proof for this. Note that this requires a
global view of quantum information complexity, even though it is defined round by round. By a
continuity argument this would also resolve Open Problem 2.2.

More generally, our understanding of the relationship between quantum information and
communication complexity is in its early stages of development. Questions of interactive protocol
compression occupy a central position in understanding the connection between classical
information and communication complexity [BBCR10, Bra12, GKR14]. In particular, [BBCR10] shows
that a protocol $\pi$ with information cost $I$ and communication cost $C$ can be compressed into a
protocol with communication cost $\tilde{O}(\sqrt{I \cdot C})$. It remains open whether this (or an analogous) fact
is true in the quantum setting:

Open Problem 2.4. Given a quantum protocol $\Pi$ over a distribution $\mu$ of inputs whose
communication cost is $C$ and whose quantum information cost is $I$, can $\Pi$ be simulated (with a small error)
using a quantum protocol $\Pi'$ whose communication cost is $\tilde{O}(\sqrt{I \cdot C})$?


3 Preliminaries 

3.1 Quantum Information Theory 

We use the following notation for quantum theory; see [Wat13, Wil13] for more details. We associate
a quantum register $A$ with a corresponding vector space, also denoted by $A$. We only consider
finite-dimensional vector spaces. A state of quantum register $A$ is represented by a density operator
$\rho \in \mathcal{D}(A)$, with $\mathcal{D}(A)$ the set of all unit trace, positive semi-definite linear operators mapping $A$
into itself. We say that a state $\rho$ is pure if it is a projection operator, i.e. $(\rho^A)^2 = \rho^A$. For a pure
state $\rho$, we might use the pure state formalism, and represent $\rho$ by the vector $|\rho\rangle$ it projects upon,
i.e. $\rho = |\rho\rangle\langle\rho|$; this is well-defined up to an irrelevant phase factor.

A quantum channel from quantum register $A$ into quantum register $B$ is represented by a
super-operator $\mathcal{N}^{A \to B} \in \mathcal{C}(A,B)$, with $\mathcal{C}(A,B)$ the set of all completely positive, trace-preserving linear
operators from $\mathcal{D}(A)$ into $\mathcal{D}(B)$. If $A = B$, we might simply write $\mathcal{N}^A$, and when systems are
clear from context, we might drop the superscripts. For channels $\mathcal{N}_1 \in \mathcal{C}(A,B)$, $\mathcal{N}_2 \in \mathcal{C}(B,C)$, we
denote their composition as $\mathcal{N}_2 \circ \mathcal{N}_1 \in \mathcal{C}(A,C)$, with action $(\mathcal{N}_2 \circ \mathcal{N}_1)(\rho) = \mathcal{N}_2(\mathcal{N}_1(\rho))$ on any state
$\rho \in \mathcal{D}(A)$. We might drop the $\circ$ symbol if the composition is clear from context. For $A$ and $B$
isomorphic, we denote the identity mapping as $I^{A \to B}$, with some implicit choice for the change of
basis. For $\mathcal{N}^{A_1 \to B_1} \otimes I^{A_2 \to B_2} \in \mathcal{C}(A_1 \otimes A_2, B_1 \otimes B_2)$, we might abbreviate this as $\mathcal{N}$ and leave the
identity channel implicit when the meaning is clear from context.

An important subset of $\mathcal{C}(A,B)$ when $A$ and $B$ are isomorphic spaces is the set of unitary
channels $\mathcal{U}(A,B)$, the set of all maps $U \in \mathcal{C}(A,B)$ with an adjoint map $U^\dagger \in \mathcal{C}(B,A)$ such that
$U^\dagger \circ U = I^A$ and $U \circ U^\dagger = I^B$. More generally, if $\dim(B) \ge \dim(A)$, we denote by $\mathcal{U}(A,B)$ the
set of isometric channels, i.e. the set of all maps $V \in \mathcal{C}(A,B)$ with an adjoint map $V^\dagger \in \mathcal{C}(B,A)$
such that $V^\dagger \circ V = I^A$. Another important example of a channel that we use is the partial trace
$\mathrm{Tr}_B(\cdot) \in \mathcal{C}(A \otimes B, A)$, which effectively gets rid of the $B$ subsystem to obtain the marginal state
on subsystem $A$. Fixing an orthonormal basis $\{|b\rangle\}$ for $B$, we can write the action of $\mathrm{Tr}_B$ on any
$\rho^{AB} \in \mathcal{D}(A \otimes B)$ as $\mathrm{Tr}_B(\rho^{AB}) = \sum_b \langle b| \rho^{AB} |b\rangle$. Note that the action of $\mathrm{Tr}_B$ is independent of the
choice of basis chosen to represent it, so we unambiguously write $\rho^A = \mathrm{Tr}_B(\rho^{AB})$. We also use the
notation $\mathrm{Tr}_{\neg A} = \mathrm{Tr}_B$ to express that we want to keep only the $A$ register.

Fixing a basis also allows us to talk about classical states and joint states: $\rho \in \mathcal{D}(B)$ is
classical (with respect to this basis) if it is diagonal in basis $\{|b\rangle\}$, i.e. $\rho = \sum_b p_B(b) \cdot |b\rangle\langle b|$ for
some probability distribution $p_B$. More generally, subsystem $B$ of $\rho^{AB}$ is said to be classical if
we can write $\rho^{AB} = \sum_b p_B(b) \cdot |b\rangle\langle b|^B \otimes \rho_b^A$ for some $\rho_b^A \in \mathcal{D}(A)$. An important example of a
channel mapping a quantum system to a classical one is the measurement channel $\Delta_B$, defined as
$\Delta_B(\rho) = \sum_b \langle b| \rho |b\rangle \cdot |b\rangle\langle b|^B$ for any $\rho \in \mathcal{D}(B)$. Note that for any state $\rho \in \mathcal{D}(B_1 \otimes B_2 \otimes C \otimes R)$ of
the form

\[ |\rho\rangle^{B_1 B_2 C R} = \sum_b \sqrt{p_B(b)} \cdot |b\rangle^{B_1} |b\rangle^{B_2} |\rho_b\rangle^{CR}, \]

we have $\mathrm{Tr}_{B_2}(\rho^{B_1 B_2 C R}) = \sum_b p_B(b) \cdot |b\rangle\langle b|^{B_1} \otimes \rho_b^{CR}$ and $\mathrm{Tr}_{B_2 R}(\rho^{B_1 B_2 C R}) = \sum_b p_B(b) \cdot |b\rangle\langle b|^{B_1} \otimes \rho_b^{C}$,
with the state on $B_1$ classical in both cases. Often, $A, B, C, \cdots$ will be used to discuss general
systems, while $X, Y, Z, \cdots$ will be reserved for classical systems, or quantum systems like $B_1$ and
$B_2$ above that are classical once one of them is traced out, and can be thought of as containing a
quantum copy of the classical content of one another.

For a state $\rho^A \in \mathcal{D}(A)$, a purification is a pure state $\rho^{AR} \in \mathcal{D}(A \otimes R)$ satisfying $\mathrm{Tr}_R(\rho^{AR}) = \rho^A$.
If $R$ has dimension at least that of $A$, then such a purification always exists. For a given $R$,
all purifications are equivalent up to a unitary on $R$, and more generally, if $\dim(R') \ge \dim(R)$
and $\rho^{AR'}, \rho^{AR}$ are two purifications of $\rho^A$, then there exists an isometry $V_\rho^{R \to R'}$ such that $\rho^{AR'} =
V_\rho(\rho^{AR})$. For a channel $\mathcal{N} \in \mathcal{C}(A,B)$, an isometric extension is an isometry $U_{\mathcal{N}} \in \mathcal{U}(A, A' \otimes B)$ with
$\mathrm{Tr}_{A'}(U_{\mathcal{N}}(\rho^A)) = \mathcal{N}(\rho^A)$ for all $\rho^A$. Such an extension always exists provided $A'$ is of dimension
at least $\dim(A)^2$. For the measurement channel $\Delta_B$, an isometric extension is given by $U_\Delta =
\sum_b |b\rangle^{B'} |b\rangle^{B} \langle b|^{B}$.

The notion of distance we use is the trace distance, defined for two states $\rho_1, \rho_2 \in \mathcal{D}(A)$ as the
sum of the absolute values of the eigenvalues of their difference:

\[ \|\rho_1 - \rho_2\|_A = \mathrm{Tr}(|\rho_1 - \rho_2|). \]

It has an operational interpretation as four times the best bias possible in a state discrimination
test between $\rho_1$ and $\rho_2$. The subscript tells on which subsystems the trace distance is evaluated,
and remaining subsystems might need to be traced out. We use the following results about trace
distance. For proofs of these and other standard results in quantum information theory that we
use, see [Wil13]. The trace distance is monotone under noisy channels: for any $\rho_1, \rho_2 \in \mathcal{D}(A)$ and
$\mathcal{N} \in \mathcal{C}(A,B)$,

\[ \|\mathcal{N}(\rho_1) - \mathcal{N}(\rho_2)\|_B \le \|\rho_1 - \rho_2\|_A. \tag{1} \]

For isometries, the inequality becomes an equality, a property called isometric invariance of the
trace distance. Hence, for any $\rho_1, \rho_2 \in \mathcal{D}(A)$ and any $U \in \mathcal{U}(A,B)$, we have

\[ \|U(\rho_1) - U(\rho_2)\|_B = \|\rho_1 - \rho_2\|_A. \tag{2} \]

Also, the trace distance cannot be increased by adjoining an uncorrelated system: for any
$\rho_1, \rho_2 \in \mathcal{D}(A)$, $\sigma \in \mathcal{D}(B)$,

\[ \|\rho_1 \otimes \sigma - \rho_2 \otimes \sigma\|_{AB} = \|\rho_1 - \rho_2\|_A. \tag{3} \]

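The trace distance obeys a property that we call joint linearity: for a classical system $X$ and two
states $\rho_1^{XA} = \sum_x p_X(x) \cdot |x\rangle\langle x|^X \otimes \rho_{1,x}^A$ and $\rho_2^{XA} = \sum_x p_X(x) \cdot |x\rangle\langle x|^X \otimes \rho_{2,x}^A$,

\[ \|\rho_1 - \rho_2\|_{XA} = \sum_x p_X(x) \|\rho_{1,x} - \rho_{2,x}\|_A. \tag{4} \]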
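
As a sanity check on the normalization and on monotonicity (1), the following sketch restricts to single-qubit states written as Bloch vectors, where $\rho = (I + \vec{r} \cdot \vec{\sigma})/2$. The restriction to qubits, the choice of the depolarizing channel, and all function names are assumptions made here for illustration.

```python
import math


def trace_distance(r1, r2):
    """Tr|rho1 - rho2| for qubits with Bloch vectors r1 and r2.

    rho1 - rho2 = ((r1 - r2) . sigma) / 2 has eigenvalues +-|r1 - r2| / 2,
    so the sum of their absolute values is the Euclidean distance |r1 - r2|.
    """
    return math.dist(r1, r2)


def best_guess_probability(r1, r2):
    """Optimal success probability when the two states are equally likely.

    The bias above 1/2 is one quarter of the trace distance as defined here.
    """
    return 0.5 + trace_distance(r1, r2) / 4


def depolarize(r, p):
    """Depolarizing channel rho -> (1 - p) rho + p I/2 on a Bloch vector."""
    return tuple((1 - p) * c for c in r)


zero, one, plus = (0, 0, 1), (0, 0, -1), (1, 0, 0)
print(trace_distance(zero, one), best_guess_probability(zero, one))
print(trace_distance(zero, plus),
      trace_distance(depolarize(zero, 0.5), depolarize(plus, 0.5)))
```

With this normalization, orthogonal pure states are at distance 2 and can be distinguished perfectly, and applying the same noisy channel to both states can only bring them closer.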

The measure of information that we use is the von Neumann entropy, defined for any state
$\rho \in \mathcal{D}(A)$ as

\[ H(A)_\rho = -\mathrm{Tr}(\rho \log \rho), \]

in which we take the convention that $0 \log 0 = 0$, justified by a continuity argument. The logarithm
$\log$ is taken in base 2, while the natural logarithm is denoted $\ln$. Note that $H$ is invariant under
isometries applied on $\rho$. If the state to be evaluated is clear from context, we might drop the
subscript. Conditional entropy for a state $\rho^{ABC} \in \mathcal{D}(A \otimes B \otimes C)$ is then defined as

\[ H(A|B) = H(AB) - H(B), \]

mutual information as

\[ I(A;B) = H(A) - H(A|B), \]

and conditional mutual information as

\[ I(A;B|C) = H(A|C) - H(A|BC). \]

Note that mutual information and conditional mutual information are symmetric in interchange
of $A, B$, and invariant under a local isometry applied to $A$, $B$ or $C$. For any pure bipartite state
$\rho^{AB} \in \mathcal{D}(A \otimes B)$, the entropy on each subsystem is the same:

\[ H(A) = H(B). \tag{5} \]

Since all purifications are equivalent up to an isometry on the purification registers, we get that for
any two pure states $|\phi\rangle^{ABCR'}$ and $|\psi\rangle^{ABCR}$ such that $\phi^{ABC} = \psi^{ABC}$,

\[ I(C;R'|B)_\phi = I(C;R|B)_\psi. \tag{6} \]

For isomorphic $A, A'$, a maximally entangled state $\psi \in \mathcal{D}(A \otimes A')$ is a pure state satisfying
$H(A) = H(A') = \log\dim(A) = \log\dim(A')$. For a system $A$ of dimension $\dim(A)$ and any
$\rho \in \mathcal{D}(A \otimes B \otimes C)$, we have the bounds

\[ 0 \le H(A) \le \log\dim(A), \tag{7} \]
\[ -H(A) \le H(A|B) \le H(A), \tag{8} \]
\[ 0 \le I(A;B) \le 2H(A), \tag{9} \]
\[ 0 \le I(A;B|C) \le 2H(A). \tag{10} \]

If $A$ or $B$ is a classical system, we get the tighter bounds

\[ 0 \le H(A|B), \tag{11} \]
\[ I(A;B) \le H(A), \tag{12} \]
\[ I(A;B|C) \le H(A). \tag{13} \]
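
The quantum-specific extremes in (8) and (9) are attained by entangled pure states. The sketch below checks this for two-qubit pure states $\sum_{a,b} m_{ab} |a\rangle|b\rangle$ with real amplitudes; the restriction to real amplitudes and two qubits, and all function names, are assumptions made for this illustration. It uses that a pure joint state has $H(AB) = 0$.

```python
import math


def eigenvalues_2x2_symmetric(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean + radius, mean - radius


def entropy(eigenvalues):
    """Entropy in bits of a spectrum, with the convention 0 log 0 = 0."""
    return -sum(p * math.log2(p) for p in eigenvalues if p > 1e-15)


def reduced_entropy(m):
    """H(A) for the pure state with amplitudes m[a][b]; rho_A = m m^T."""
    a = m[0][0] ** 2 + m[0][1] ** 2
    b = m[0][0] * m[1][0] + m[0][1] * m[1][1]
    d = m[1][0] ** 2 + m[1][1] ** 2
    return entropy(eigenvalues_2x2_symmetric(a, b, d))


def transpose(m):
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]


def pure_state_quantities(m):
    """Return (H(A), H(B), H(A|B), I(A;B)) for a pure two-qubit state."""
    h_a = reduced_entropy(m)
    h_b = reduced_entropy(transpose(m))  # rho_B = m^T m
    h_ab = 0.0                           # the joint state is pure
    h_a_given_b = h_ab - h_b
    mutual = h_a - h_a_given_b
    return h_a, h_b, h_a_given_b, mutual


s = 1 / math.sqrt(2)
print(pure_state_quantities([[s, 0.0], [0.0, s]]))        # maximally entangled
print(pure_state_quantities([[0.6, 0.0], [0.48, 0.64]]))  # partially entangled
```

For the maximally entangled state the conditional entropy reaches $-H(A)$ and the mutual information reaches $2H(A)$, while the second example illustrates $H(A) = H(B)$ from (5) for a state that is not symmetric in its amplitudes.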

The conditional mutual information satisfies a chain rule: for any $\rho \in \mathcal{D}(A \otimes B \otimes C \otimes D)$,

\[ I(AB;C|D) = I(A;C|D) + I(B;C|AD). \tag{14} \]

For product states $\rho^{A_1 B_1 C_1 A_2 B_2 C_2} = \rho_1^{A_1 B_1 C_1} \otimes \rho_2^{A_2 B_2 C_2}$, entropy is additive,

\[ H(A_1 A_2) = H(A_1) + H(A_2), \tag{15} \]

and so there is no conditional mutual information between product systems,

\[ I(A_1; A_2 | B_1 B_2) = 0, \tag{16} \]

and conditioning on a product system is useless,

\[ I(A_1; B_1 | C_1 A_2) = I(A_1; B_1 | C_1). \tag{17} \]

More generally,

\[ I(A_1 A_2; B_1 B_2 | C_1 C_2) = I(A_1; B_1 | C_1) + I(A_2; B_2 | C_2). \tag{18} \]

Two important properties of the conditional mutual information are non-negativity, equivalent
to strong subadditivity, and the data processing inequality. For any $\rho \in \mathcal{D}(A \otimes B \otimes C)$ and
$\mathcal{N} \in \mathcal{C}(B, B')$, with $\sigma = \mathcal{N}(\rho)$,

\[ I(A;B|C)_\rho \ge 0, \tag{19} \]
\[ I(A;B|C)_\rho \ge I(A;B'|C)_\sigma. \tag{20} \]

For classical systems, conditioning is equivalent to taking an average: for any
$\rho^{ABCX} = \sum_x p_X(x) \cdot |x\rangle\langle x|^X \otimes \rho_x^{ABC}$ for a classical system $X$ and some appropriate $\rho_x \in \mathcal{D}(A \otimes B \otimes C)$,

\[ H(A|BX)_\rho = \sum_x p_X(x) \cdot H(A|B)_{\rho_x}, \tag{21} \]
\[ I(A;B|CX)_\rho = \sum_x p_X(x) \cdot I(A;B|C)_{\rho_x}. \tag{22} \]

3.2 Quantum Communication Model

The model for communication complexity that we consider is the following. For a given bipartite
relation $T \subset X \times Y \times Z_A \times Z_B$ and input distribution $\mu$ on $X \times Y$, Alice and Bob are given
input registers $A_{in}, B_{in}$ containing their classical input $x \in X, y \in Y$ at the outset of the protocol,
respectively, and they output registers $A_{out}, B_{out}$ containing their classical output $z_A \in Z_A, z_B \in Z_B$
at the end of the protocol, respectively, which should satisfy the relation $T$. We generally
allow for some small error $\epsilon$ in the output, which will be formalized below. In this distributional
communication complexity setting, the input is a classical state
$\rho = \sum_{x \in X, y \in Y} \mu(x,y) \cdot |x\rangle\langle x|^{A_{in}} \otimes |y\rangle\langle y|^{B_{in}}$,
similarly for the output
$\Pi(\rho) = \sum_{z_A \in Z_A, z_B \in Z_B} p_{Z_A Z_B}(z_A, z_B) \cdot |z_A\rangle\langle z_A|^{A_{out}} \otimes |z_B\rangle\langle z_B|^{B_{out}}$
of the protocol $\Pi$ implementing the relation, and the error parameter corresponds to the average
probability of failure $\sum_{x,y} \mu(x,y) \cdot \Pr_\Pi[(x, y, \Pi(x,y)) \notin T] \le \epsilon$.

An $r$-round protocol $\Pi$ for implementing relation $T$ on input $\rho^{A_{in} B_{in}}$ is defined by a sequence
of isometries $U_1, \cdots, U_{r+1}$ along with a pure state $\psi \in \mathcal{D}(T_A \otimes T_B)$ shared between Alice and
Bob, for arbitrary finite dimensional registers $T_A, T_B$. For appropriate finite dimensional memory
registers $A_1, A_3, \cdots, A_{r-1}, A'$ held by Alice, $B_2, B_4, \cdots, B_{r-2}, B'$ held by Bob, and communication
registers $C_1, C_2, C_3, \cdots, C_r$ exchanged by Alice and Bob, we have
$U_1 \in \mathcal{U}(A_{in} \otimes T_A, A_1 \otimes C_1)$,
$U_2 \in \mathcal{U}(B_{in} \otimes T_B \otimes C_1, B_2 \otimes C_2)$,
$U_3 \in \mathcal{U}(A_1 \otimes C_2, A_3 \otimes C_3)$,
$U_4 \in \mathcal{U}(B_2 \otimes C_3, B_4 \otimes C_4)$, $\cdots$,
$U_r \in \mathcal{U}(B_{r-2} \otimes C_{r-1}, B_{out} \otimes B' \otimes C_r)$,
$U_{r+1} \in \mathcal{U}(A_{r-1} \otimes C_r, A_{out} \otimes A')$.
We adopt the convention that, in the first round, $B_1 = B_0 = B_{in} \otimes T_B$, in even rounds
$A_i = A_{i-1}$, and in odd rounds $B_i = B_{i-1}$, since the party who does not apply $U_i$ keeps its register
unchanged. In this way, in round $i$, after application of $U_i$, Alice holds register $A_i$, Bob holds
register $B_i$ and the communication register is $C_i$. We slightly abuse notation and also write $\Pi$ to
denote the channel implemented by the protocol, i.e.

\[ \Pi(\rho) = \mathrm{Tr}_{A'B'}(U_{r+1} U_r \cdots U_2 U_1(\rho \otimes \psi)). \tag{23} \]

To formally define the error, we introduce a purification register $R$. For a classical input
$\rho^{A_{in} B_{in}} = \sum_{x \in X, y \in Y} \mu(x,y) \cdot |x\rangle\langle x|^{A_{in}} \otimes |y\rangle\langle y|^{B_{in}}$ like we consider here, we can always take this
purification to be of the form
$|\rho\rangle^{A_{in} B_{in} R} = \sum_{x \in X, y \in Y} \sqrt{\mu(x,y)}\, |x\rangle^{A_{in}} |y\rangle^{B_{in}} |xy\rangle^{R_1} |xy\rangle^{R_2}$, for an appropriately
chosen partition of $R$ into $R_1, R_2$. Note that if we trace out the $R_2$ register, then we are left with
a classical state such that $R_1$ contains a copy of the joint input. Then we say that a protocol $\Pi$ for
implementing relation $T$ on input $\rho^{A_{in} B_{in}}$, with purification $\rho^{A_{in} B_{in} R}$, has average error $\epsilon \in [0,1]$
if $P_e = \Pr_{\mu, \Pi}[\Pi(\rho^{A_{in} B_{in} R_1}) \notin T] \le \epsilon$. We denote the set of all such protocols as $\mathcal{T}(T, \mu, \epsilon)$. If
we want to restrict this set to bounded round protocols with $r$ rounds, we write $\mathcal{T}^r(T, \mu, \epsilon)$. The
worst case error of a protocol is $P_e^w = \max_\mu P_e$, in which it is sufficient to optimize over all atomic
distributions $\mu$. We denote by $\mathcal{T}(T, \epsilon)$ the set of all protocols implementing relation $T$ with worst
case error at most $\epsilon$, and by $\mathcal{T}^r(T, \epsilon)$ if we restrict this set to $r$-round protocols.

Let us formally define the different quantities that we work with. 

Definition 3.1. For a protocol $\Pi$ as defined above, we define the quantum communication cost of
$\Pi$ as

\[ QCC(\Pi) = \sum_i \log\dim(C_i). \]
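
As a small numeric illustration of Definition 3.1, the sketch below sums $\log\dim(C_i)$ over a list of register dimensions and compares it with the cost obtained when each register is padded up to a whole number of qubits. The padding helper is a hypothetical comparison introduced here, not part of the definition, and both function names are illustrative.

```python
import math


def qcc(message_dims):
    """Quantum communication cost: sum over rounds of log2 dim(C_i)."""
    return sum(math.log2(d) for d in message_dims)


def qcc_in_whole_qubits(message_dims):
    """Cost if every register is padded to a power-of-two dimension."""
    return sum(math.ceil(math.log2(d)) for d in message_dims)


print(qcc([2, 4, 8]), qcc_in_whole_qubits([2, 4, 8]))
print(qcc([3, 3]), qcc_in_whole_qubits([3, 3]))
```

For registers of dimension at least 2, padding never more than doubles the cost, since $\lceil x \rceil \le 2x$ whenever $x \ge 1$.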


Note that we do not require that $\dim(C_i) = 2^k$ for some $k \in \mathbb{N}$, as is usually done. This
will not affect our definition of information cost and complexity, but might affect the quantum
communication complexity by at most a factor of two, without affecting the round complexity. The
corresponding notions of quantum communication complexity of a relation are:

Definition 3.2. For a relation $T \subset X \times Y \times Z_A \times Z_B$, an input distribution $\mu$ on $X \times Y$ and an
error parameter $\epsilon \in [0,1]$, we define the $\epsilon$-error quantum communication complexity of $T$ on input
$\mu$ as

\[ QCC(T, \mu, \epsilon) = \min_{\Pi \in \mathcal{T}(T, \mu, \epsilon)} QCC(\Pi), \]

and the worst-case $\epsilon$-error quantum communication complexity of $T$ as

\[ QCC(T, \epsilon) = \min_{\Pi \in \mathcal{T}(T, \epsilon)} QCC(\Pi). \]

Remark 3.3. For any $T$, $\mu$, $0 \le \epsilon_1 \le \epsilon_2 \le 1$, the following holds:

\[ QCC(T, \mu, \epsilon_2) \le QCC(T, \mu, \epsilon_1), \]
\[ QCC(T, \epsilon_2) \le QCC(T, \epsilon_1). \]

We have the following definitions for bounded round quantum communication complexity, and
a similar remark holds.

Definition 3.4. For a relation $T \subset X \times Y \times Z_A \times Z_B$, an input distribution $\mu$ on $X \times Y$, an error
parameter $\epsilon \in [0,1]$ and a bound $r \in \mathbb{N}$ on the number of rounds, we define the $r$-round, $\epsilon$-error
quantum communication complexity of $T$ on input $\mu$ as

\[ QCC^r(T, \mu, \epsilon) = \min_{\Pi \in \mathcal{T}^r(T, \mu, \epsilon)} QCC(\Pi), \]

and the $r$-round, worst-case $\epsilon$-error quantum communication complexity of $T$ as

\[ QCC^r(T, \epsilon) = \min_{\Pi \in \mathcal{T}^r(T, \epsilon)} QCC(\Pi). \]

3.3 Quantum Information Complexity 

We use the notion of quantum information complexity as defined in [Tou15]. The register $R$ is the
purification register, invariant throughout the protocol since we consider local isometric processing.
Note that, as noted before when considering an $R_1 R_2$ partition for $R$, for classical input distributions,
the purification register can be thought of as containing a (quantum) copy of the classical input.
The definition is however invariant under the choice of $R$ and corresponding purification.

Definition 3.5. For a protocol $\Pi$ and a state $\rho$ with purification held in system $R$, we define the
quantum information cost of $\Pi$ on input $\rho$ as

\[ QIC(\Pi, \rho) = \sum_{i > 0,\ \mathrm{odd}} \frac{1}{2} I(C_i; R | B_i) + \sum_{i > 0,\ \mathrm{even}} \frac{1}{2} I(C_i; R | A_i). \]

Definition 3.6. For a relation $T \subset X \times Y \times Z_A \times Z_B$, an input distribution $\mu$ on $X \times Y$, an error
parameter $\epsilon \in [0,1]$ and a number of rounds $r$, we define the $\epsilon$-error quantum information complexity
of $T$ on input $\mu$ as

\[ QIC(T, \mu, \epsilon) = \inf_{\Pi \in \mathcal{T}(T, \mu, \epsilon)} QIC(\Pi, \mu), \]

and the $r$-round, $\epsilon$-error quantum information complexity of $T$ on input $\mu$ as

\[ QIC^r(T, \mu, \epsilon) = \inf_{\Pi \in \mathcal{T}^r(T, \mu, \epsilon)} QIC(\Pi, \mu). \]

The following properties of quantum information cost and complexity were proved in Ref. [Tou15].

Lemma 3.7. For any protocol $\Pi$ and input distribution $\mu$, the following holds:

\[ 0 \leq QIC(\Pi, \mu) \leq QCC(\Pi). \]

Lemma 3.8. For a relation $T \subseteq X \times Y \times Z_A \times Z_B$, an input distribution $\mu$ on $X \times Y$, an error parameter $\epsilon \in [0,1]$ and a number of rounds $r$, the following holds:

\[ 0 \leq QIC(T, \mu, \epsilon) \leq QCC(T, \mu, \epsilon), \]

\[ 0 \leq QIC^r(T, \mu, \epsilon) \leq QCC^r(T, \mu, \epsilon). \]

Lemma 3.9. For any two protocols $\Pi^1$ and $\Pi^2$ with $r_1$ and $r_2$ rounds, respectively, there exists an $r$-round protocol $\Pi_2$, satisfying $\Pi_2 = \Pi^1 \otimes \Pi^2$, $r = \max(r_1, r_2)$, such that the following holds for any corresponding input states $\rho^1, \rho^2$:

\[ QIC(\Pi_2, \rho^1 \otimes \rho^2) = QIC(\Pi^1, \rho^1) + QIC(\Pi^2, \rho^2). \]

Lemma 3.10. For any $r$-round protocol $\Pi_2$ and any input states $\rho^1 \in \mathcal{D}(A^1_{in} \otimes B^1_{in})$, $\rho^2 \in \mathcal{D}(A^2_{in} \otimes B^2_{in})$, there exist $r$-round protocols $\Pi^1, \Pi^2$ satisfying $\Pi^1(\cdot) = \mathrm{Tr}_{A^2_{out} B^2_{out}} \circ \Pi_2(\cdot \otimes \rho^2)$, $\Pi^2(\cdot) = \mathrm{Tr}_{A^1_{out} B^1_{out}} \circ \Pi_2(\rho^1 \otimes \cdot)$, and the following holds:

\[ QIC(\Pi^1, \rho^1) + QIC(\Pi^2, \rho^2) = QIC(\Pi_2, \rho^1 \otimes \rho^2). \]

Lemma 3.11. For any $p \in [0,1]$ and any two protocols $\Pi^1, \Pi^2$ with $r_1, r_2$ rounds, respectively, there exists an $r$-round protocol $\Pi$ satisfying $\Pi = p\Pi^1 + (1-p)\Pi^2$, $r = \max(r_1, r_2)$, such that the following holds for any state $\rho$:

\[ QIC(\Pi, \rho) = p\,QIC(\Pi^1, \rho) + (1-p)\,QIC(\Pi^2, \rho). \]

Corollary 3.12. For any $p \in [0,1]$, $T$ and $\epsilon, \epsilon_1, \epsilon_2 \in [0,1]$ satisfying $\epsilon = p\epsilon_1 + (1-p)\epsilon_2$, for any bound $r = \max(r_1, r_2)$, $r_1, r_2 \in \mathbb{N}$ on the number of rounds and for any input distribution $\mu$ on $X \times Y$, the following holds:

\[ QIC(T, \mu, \epsilon) \leq p\,QIC(T, \mu, \epsilon_1) + (1-p)\,QIC(T, \mu, \epsilon_2), \]

\[ QIC^r(T, \mu, \epsilon) \leq p\,QIC^{r_1}(T, \mu, \epsilon_1) + (1-p)\,QIC^{r_2}(T, \mu, \epsilon_2). \]

Lemma 3.13. Let $\nu$ be a distribution over input states $\rho$ and denote $\bar{\rho} := \mathbb{E}_{\rho \sim \nu}\, \rho$. Then for any protocol $\pi$,

\[ \mathbb{E}_{\rho \sim \nu}[QIC(\pi, \rho)] \leq QIC(\pi, \bar{\rho}). \]

Lemma 3.14. For any $r$-round protocol $\Pi$, any input distribution $\mu$ with copies of $x, y$ in $R_1$, and any $\epsilon \in (0, 2]$, $\delta > 0$, there exists a large enough $n_0(\Pi, \mu, \epsilon, \delta)$ such that for any $n \geq n_0$, there exists an $r$-round protocol $\Pi_n$ satisfying

\[ \frac{1}{n} QCC(\Pi_n) \leq QIC(\Pi, \mu) + \delta. \]


3.4 Generalized Discrepancy Method 

The generalized discrepancy method, also known as the smooth discrepancy method, is one of the strongest methods for proving lower bounds for quantum communication.

Definition 3.15. Let $f : \mathcal{X} \times \mathcal{Y} \to \{0,1\}$ be a boolean function. The $\delta$-generalized discrepancy bound of $f$, denoted by $GDM_\delta(f)$, is defined as:

\[ GDM_\delta(f) = \max\{ GDM_\delta^\mu(f) : \mu \text{ a distribution over } \mathcal{X} \times \mathcal{Y} \}, \]

\[ GDM_\delta^\mu(f) = \max\left\{ \log\left(\frac{1}{disc^\mu(g)}\right) : g : \mathcal{X} \times \mathcal{Y} \to \{0,1\},\ \Pr_{(x,y) \sim \mu}[f(x,y) \neq g(x,y)] \leq \delta \right\}, \]

\[ disc^\mu(g) = \max\left\{ \left| \sum_{(x,y) \in R} (-1)^{g(x,y)} \cdot \mu(x,y) \right| : R \in \mathcal{R} \right\}. \]

Here $\mathcal{R}$ is the set of combinatorial rectangles $A \times B$, $A \subseteq \mathcal{X}$, $B \subseteq \mathcal{Y}$. We state two results on the generalized discrepancy method, both due to Sherstov [She07, She12], which we will use to lower bound the quantum information complexity of disjointness. The first is a threshold direct product result that will be useful to prove that the generalized discrepancy method is a lower bound on the quantum information complexity of boolean functions, and the second is a lower bound on the generalized discrepancy for the disjointness function.

Theorem 3.16 ([She12]). Let $\epsilon_{sh} > 0$ be a small enough absolute constant. Then for any boolean function $f$, the following communication problem requires $\Omega(n \cdot GDM_{1/5}(f))$ qubits of communication (with arbitrary entanglement): solving, with probability $2^{-\epsilon_{sh} n}$, at least $(1 - \epsilon_{sh})n$ among $n$ instances of $f$.

The disjointness function is defined as follows: for $(x,y) \in \{0,1\}^n \times \{0,1\}^n$, $DISJ_n(x,y) = 1$ if for all $i \in [n]$, $x_i \wedge y_i = 0$, and $0$ otherwise. We will need the following theorem.

Theorem 3.17 ([She07]). $GDM_{1/5}(DISJ_n) \geq \Omega(\sqrt{n})$.
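
As a concrete illustration of Definition 3.15 and of the disjointness function, the following Python sketch evaluates $disc^\mu(g)$ by brute force over all combinatorial rectangles for the toy instance $DISJ_1$ under the uniform distribution. The helper names (`disj`, `subsets`, `discrepancy`) are illustrative and do not come from the paper, and the enumeration is exponential in $|\mathcal{X}| + |\mathcal{Y}|$, so it is only meant for very small input spaces.

```python
from itertools import combinations, product


def disj(x, y):
    """DISJ_n(x, y) = 1 iff x_i AND y_i = 0 for every coordinate i."""
    return int(all(a & b == 0 for a, b in zip(x, y)))


def subsets(items):
    """Yield every subset of `items` as a tuple (including the empty one)."""
    items = list(items)
    for size in range(len(items) + 1):
        yield from combinations(items, size)


def discrepancy(g, mu, xs, ys):
    """Max over rectangles A x B of |sum_{(x,y) in A x B} (-1)^g(x,y) * mu(x,y)|."""
    best = 0.0
    for A in subsets(xs):
        for B in subsets(ys):
            total = sum((-1) ** g(x, y) * mu[(x, y)] for x in A for y in B)
            best = max(best, abs(total))
    return best


# Toy instance: n = 1 with the uniform distribution on {0,1} x {0,1}.
n = 1
xs = list(product((0, 1), repeat=n))
ys = list(product((0, 1), repeat=n))
uniform = {(x, y): 1.0 / (len(xs) * len(ys)) for x in xs for y in ys}
disc_disj1 = discrepancy(disj, uniform, xs, ys)
```

Under these assumptions the uniform-distribution discrepancy of $DISJ_1$ evaluates to $1/2$, attained for instance by the rectangle $\{0\} \times \{0,1\}$.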

4 Properties of Quantum Information Complexity 

In this section, we prove general results about quantum information complexity that we use to 
obtain the main results. These may be of independent interest. 



4.1 Prior-free Quantum Information Complexity 

We want to define a sensible notion of quantum information complexity for classical tasks. As in the classical setting [Bra12], there are two sensible orderings for the optimization over inputs and protocols. We provide the two corresponding definitions and then investigate the link between them. We denote by $\mathcal{D}_{XY}$ the set of all distributions $\mu$ on input space $X \times Y$.

Definition 4.1. The max-distributional quantum information complexity of a relation $T$ with error $\epsilon \in [0,1]$ is

\[ QIC_D(T, \epsilon) = \max_{\mu \in \mathcal{D}_{XY}} QIC(T, \mu, \epsilon). \]

When restricting to $r$-round protocols, it is

\[ QIC_D^r(T, \epsilon) = \max_{\mu \in \mathcal{D}_{XY}} QIC^r(T, \mu, \epsilon). \]

Definition 4.2. The quantum information complexity of a relation $T$ with error $\epsilon \in [0,1]$ is

\[ QIC(T, \epsilon) = \inf_{\Pi \in \mathcal{T}(T, \epsilon)} \max_{\mu \in \mathcal{D}_{XY}} QIC(\Pi, \mu). \]

When restricting to $r$-round protocols, it is

\[ QIC^r(T, \epsilon) = \inf_{\Pi \in \mathcal{T}^r(T, \epsilon)} \max_{\mu \in \mathcal{D}_{XY}} QIC(\Pi, \mu). \]

Lemma 4.3 (Information lower bounds communication). For any relation $T$, error parameter $\epsilon \in [0,1]$, and number of rounds $r \in \mathbb{N}$, the following holds:

\[ QIC^r(T, \epsilon) \leq QCC^r(T, \epsilon), \]

\[ QIC(T, \epsilon) \leq QCC(T, \epsilon). \]

Proof. Let $\Pi$ be a protocol computing $T$ correctly except with probability $\epsilon$ on all inputs and satisfying $QCC(\Pi) = QCC(T, \epsilon)$. We get the result by noting that $QIC(T, \epsilon) \leq \max_\mu QIC(\Pi, \mu) \leq QCC(\Pi)$. □

Clearly, $QIC_D(T, \epsilon) \leq QIC(T, \epsilon)$, and $QIC_D^r(T, \epsilon) \leq QIC^r(T, \epsilon)$. We prove that we can almost reverse the quantifiers. The proof idea follows the lines of the proof of Theorem 3.5 in Ref. [Bra12], but special care must be taken for quantum protocols. The idea we use is to take an $\epsilon$-net over $\mathcal{D}_{XY}$, and then take a $\delta$-optimal protocol for each distribution in the net. To extend this result to the unbounded round quantum setting, we adapt a compactness argument from Ref. [BGPW13a], itself adapted from Ref. [Ter72]. The following results will be used.

Lemma 4.4 (Continuity in average error). Quantum information complexity is continuous in the error. This holds uniformly in the input. That is, for all $T$, $r$ and $\epsilon, \delta > 0$, there exists $\epsilon' \in (0, \epsilon)$ such that for all $\epsilon'' \in (0, \epsilon')$ and for all $\mu$,

\[ |QIC(T, \mu, \epsilon - \epsilon'') - QIC(T, \mu, \epsilon)| \leq \delta, \]

\[ |QIC^r(T, \mu, \epsilon - \epsilon'') - QIC^r(T, \mu, \epsilon)| \leq \delta. \]

Proof. Note that we can drop the absolute values and also work at $\epsilon'$ since quantum information complexity is non-increasing in the error, i.e. $QIC(T, \mu, \epsilon) \leq QIC(T, \mu, \epsilon - \epsilon'') \leq QIC(T, \mu, \epsilon - \epsilon')$. Let $0 < p < \frac{1}{2}$ and use Corollary 3.12 with $\epsilon_1 = 0$, $\epsilon_2 = \epsilon$, $\epsilon' = p\epsilon$ for the current $\epsilon$. We get

\begin{align*}
QIC(T, \mu, \epsilon - \epsilon') &\leq p\,QIC(T, \mu, 0) + (1-p)\,QIC(T, \mu, \epsilon) \\
&\leq p\,QCC(T, 0) + QIC(T, \mu, \epsilon).
\end{align*}

Rearranging terms and using $p = \epsilon'/\epsilon$, we get

\[ |QIC(T, \mu, \epsilon - \epsilon') - QIC(T, \mu, \epsilon)| \leq \frac{\epsilon'}{\epsilon}\, QCC(T, 0). \]

This bound is independent of $\mu$, and goes to zero as $p$ and $\epsilon'$ do, so the result follows. The bounded round result is proved in the same way, obtaining $QCC^r(T, 0)$ in the final bound instead. □

Lemma 4.5 (Convexity in error). For any $p \in [0,1]$, $T$ and $\epsilon, \epsilon_1, \epsilon_2 \in [0,1]$ satisfying $\epsilon = p\epsilon_1 + (1-p)\epsilon_2$ and for any bound $r = \max(r_1, r_2)$, $r_1, r_2 \in \mathbb{N}$ on the number of rounds, the following holds:

\[ QIC(T, \epsilon) \leq p\,QIC(T, \epsilon_1) + (1-p)\,QIC(T, \epsilon_2), \]

\[ QIC^r(T, \epsilon) \leq p\,QIC^{r_1}(T, \epsilon_1) + (1-p)\,QIC^{r_2}(T, \epsilon_2). \]

Proof. The proof is similar to the one for the analogous result with fixed input. Given $\delta > 0$, let $\Pi^1$ and $\Pi^2$ be protocols satisfying, for all $\mu$ and for $i \in \{1,2\}$, $\Pi^i \in \mathcal{T}(T, \epsilon_i)$ and $QIC(\Pi^i, \mu) \leq QIC(T, \epsilon_i) + \delta$, and take the corresponding protocol $\Pi$ of Lemma 3.11. First, protocol $\Pi$ successfully accomplishes its task, i.e. it implements task $T$ on all inputs with error bounded by $\epsilon = p\epsilon_1 + (1-p)\epsilon_2$. We must now verify that the quantum information cost satisfies the convexity property:

\begin{align*}
QIC(T, \epsilon) &\leq \max_\mu QIC(\Pi, \mu) \\
&= \max_\mu \left( p\,QIC(\Pi^1, \mu) + (1-p)\,QIC(\Pi^2, \mu) \right) \\
&\leq p \max_\mu QIC(\Pi^1, \mu) + (1-p) \max_\mu QIC(\Pi^2, \mu) \\
&\leq p\,QIC(T, \epsilon_1) + (1-p)\,QIC(T, \epsilon_2) + 2\delta.
\end{align*}

Since $\delta > 0$ is arbitrary, the first inequality of the lemma follows. Keeping track of rounds, we get the bounded round result. □

Corollary 4.6 (Continuity in error). Quantum information complexity is continuous in the error. That is, for all $T$, $r$ and $\epsilon, \delta > 0$, there exists $\epsilon' \in (0, \epsilon)$ such that for all $\epsilon'' \in (0, \epsilon')$,

\[ |QIC(T, \epsilon - \epsilon'') - QIC(T, \epsilon)| \leq \delta, \]

\[ |QIC^r(T, \epsilon - \epsilon'') - QIC^r(T, \epsilon)| \leq \delta. \]


Lemma 4.7 (Quasi-convexity in input). For any $p \in [0,1]$, define $\rho = p\rho_1 + (1-p)\rho_2$ for any two input states $\rho_1, \rho_2$. Then the following holds for any $r$-round protocol $\Pi$:

\[ QIC(\Pi, \rho) \geq p\,QIC(\Pi, \rho_1) + (1-p)\,QIC(\Pi, \rho_2), \]

\[ QIC(\Pi, \rho) \leq p\,QIC(\Pi, \rho_1) + (1-p)\,QIC(\Pi, \rho_2) + rH(p). \]

Proof. The first inequality is Lemma 3.13 and the second is obtained by keeping track of the remainder terms discarded in its proof. Let $R$ be a register holding a purification of $\rho_1$ and $\rho_2$; then we can purify $\rho$ with two copies $S_1, S_2$ of a selector reference register, such that

\[ |\rho\rangle^{A_{in} B_{in} R S_1 S_2} = \sqrt{p}\, |\rho_1\rangle^{A_{in} B_{in} R} |1\rangle^{S_1} |1\rangle^{S_2} + \sqrt{1-p}\, |\rho_2\rangle^{A_{in} B_{in} R} |2\rangle^{S_1} |2\rangle^{S_2}. \]

We can then expand each term as

\[ I(C_i; R S_1 S_2 \mid B_i)_\rho = I(C_i; S_1 \mid B_i)_\rho + I(C_i; R \mid B_i S_1)_\rho + I(C_i; S_2 \mid B_i R S_1)_\rho, \]

and similarly for terms conditioning on Alice's systems $A_i$. The result follows by summing over all rounds since

\[ I(C_i; R \mid B_i S_1)_\rho = p \cdot I(C_i; R \mid B_i)_{\rho_1} + (1-p) \cdot I(C_i; R \mid B_i)_{\rho_2}, \]

and then $H(S) = H(p)$ upper bounds the two remainder terms in each of the $r$ rounds. □

Lemma 4.8 (Continuity in input). Quantum information cost for $r$-round protocols is uniformly continuous in the input distribution. This holds uniformly over all $r$-round protocols over input $X \times Y$. That is, for all $r$, $|X|$, $|Y|$, and $\epsilon > 0$, there exists $\delta > 0$ such that for all $\mu_1$ and $\mu_2$ that are $\delta$-close and all $r$-round protocols $\Pi$,

\[ |QIC(\Pi, \mu_1) - QIC(\Pi, \mu_2)| \leq \epsilon. \]

Proof. Let $\delta > 0$ and fix $\mu_1$ and $\mu_2$ that are $\delta$-close. We can then write, for some common part $\mu_0$ and remainder parts $\mu_1', \mu_2'$,

\begin{align*}
\mu_1 &= (1-\delta)\mu_0 + \delta\mu_1', \\
\mu_2 &= (1-\delta)\mu_0 + \delta\mu_2', \\
\mu_0(x,y) &= \frac{\min(\mu_1(x,y), \mu_2(x,y))}{\sum_{x',y'} \min(\mu_1(x',y'), \mu_2(x',y'))}.
\end{align*}

Using the bounds of Lemma 4.7 once on each of $\mu_1$ and $\mu_2$, we get

\begin{align*}
QIC(\Pi, \mu_1) &\leq (1-\delta)\,QIC(\Pi, \mu_0) + \delta\,QIC(\Pi, \mu_1') + rH(\delta) \\
&\leq (1-\delta)\,QIC(\Pi, \mu_0) + \delta\,QIC(\Pi, \mu_2') + \delta\,QIC(\Pi, \mu_1') + rH(\delta) \\
&\leq QIC(\Pi, \mu_2) + \delta \cdot r(\log|X| + \log|Y|) + rH(\delta).
\end{align*}

Similarly, we get a bound on $QIC(\Pi, \mu_2)$ in terms of $QIC(\Pi, \mu_1)$, so the following holds:

\[ |QIC(\Pi, \mu_1) - QIC(\Pi, \mu_2)| \leq \delta \cdot r(\log|X| + \log|Y|) + rH(\delta). \]

This bound is independent of $\mu_1, \mu_2$, depends on $\Pi$ only through $r$ and $|X|, |Y|$, and goes to zero as $\delta$ does, so the result follows. □
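
To make the final bound of this proof tangible, the following Python sketch evaluates $\delta \cdot r(\log|X| + \log|Y|) + rH(\delta)$ and searches for a $\delta$ that pushes it below a target $\epsilon$. It assumes base-2 logarithms and a crude halving search; the function names are hypothetical and the $\delta$ returned is not claimed to be optimal.

```python
import math


def binary_entropy(p):
    """Binary entropy H(p) in bits, with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def continuity_bound(delta, rounds, size_x, size_y):
    """delta * r * (log|X| + log|Y|) + r * H(delta), the bound from the proof."""
    log_sizes = math.log2(size_x) + math.log2(size_y)
    return delta * rounds * log_sizes + rounds * binary_entropy(delta)


def delta_for(eps, rounds, size_x, size_y):
    """Halve delta until the bound is at most eps (assumes eps > 0)."""
    delta = 0.5
    while continuity_bound(delta, rounds, size_x, size_y) > eps:
        delta /= 2
    return delta
```

For example, `delta_for(0.1, 3, 2, 2)` returns some $\delta$ for which a 3-round protocol on single-bit inputs changes its information cost by at most $0.1$ between $\delta$-close distributions, according to the bound above.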

Corollary 4.9. Suppose we have an $r$-round protocol $\Pi$ for AND. Then,

\[ QIC(\Pi, \mu) \leq QIC(\Pi, \mu_0) + O(rH(w)), \tag{24} \]

where $w = \mu(1,1) \leq 1/2$, $\mu_0(1,1) = 0$, and $\mu_0(x_1, y_1) = \frac{1}{1-w}\mu(x_1, y_1)$ otherwise.

Proof. This just follows from the proof of Lemma 4.8 since the input size is constant. □

Theorem 4.10. For a relation $T \subseteq X \times Y \times Z_A \times Z_B$, an error parameter $\epsilon \in (0,1)$, a number of rounds $r$ and each value $\alpha \in (0,1)$,

\[ QIC^r\left(T, \frac{\epsilon}{\alpha}\right) \leq \frac{QIC_D^r(T, \epsilon)}{1 - \alpha}. \]

Proof. Fix $T, r, \epsilon, \alpha$ and denote $I = QIC_D^r(T, \epsilon)$. For any $\delta_1 \in (0,1)$, we want to prove the existence of a protocol $\Pi \in \mathcal{T}^r(T, \frac{\epsilon}{\alpha} \cdot (1 + 2\delta_1))$ satisfying $QIC(\Pi, \mu) \leq \frac{I}{1-\alpha} \cdot (1 + 2\delta_1)$ for all $\mu \in \mathcal{D}_{XY}$. This shows that $QIC^r(T, \frac{\epsilon}{\alpha} \cdot (1 + 2\delta_1)) \leq \frac{I}{1-\alpha} \cdot (1 + 2\delta_1)$, and then by continuity of quantum information complexity in the error, we get the result by taking $\delta_1$ to 0. The proof follows along the lines of the one for the analogous result for classical information complexity [Bra12], using a minimax argument. We take extra care to account for the continuum of quantum protocols, the round-by-round definition of quantum information cost, and the fact that we do not have a bound on the size of the entanglement. Let $\delta_2 \in (0, \epsilon\delta_1)$ satisfy the following two properties for all $\mu_1, \mu_2$ that are $\delta_2$-close, and for all $r$-round protocols $\Pi$:

\[ |QIC(\Pi, \mu_1) - QIC(\Pi, \mu_2)| \leq I \cdot \frac{\delta_1}{2}, \tag{25} \]

\[ |QIC^r(T, \mu_1, \epsilon - \delta_2) - QIC^r(T, \mu_1, \epsilon)| \leq I \cdot \frac{\delta_1}{10}. \tag{26} \]

The first inequality is possible by Lemma 4.8, i.e. by the uniform continuity of quantum information cost in the input, uniformly over all $r$-round protocols, and the second is possible by Lemma 4.4, i.e. the continuity of quantum information complexity in the error, uniformly over all inputs. Fix a finite $\delta_2$-net for $\mathcal{D}_{XY}$, that we denote $N_{XY}$. For each $\mu \in N_{XY}$, fix a protocol $\Pi_\mu \in \mathcal{T}^r(T, \mu, \epsilon - \delta_2)$ such that $QIC(\Pi_\mu, \mu) \leq QIC^r(T, \mu, \epsilon - \delta_2) \cdot (1 + \frac{\delta_1}{10})$ and denote the set of all such protocols $P_N$. We then have $|P_N| = |N_{XY}| < \infty$, and we get using (26) that

\begin{align*}
QIC(\Pi_\mu, \mu) &\leq QIC^r(T, \mu, \epsilon - \delta_2) \cdot \left(1 + \frac{\delta_1}{10}\right) \\
&\leq \left(QIC^r(T, \mu, \epsilon) + I \cdot \frac{\delta_1}{10}\right)\left(1 + \frac{\delta_1}{10}\right) \\
&\leq I\left(1 + \frac{\delta_1}{10}\right)^2 \\
&\leq I\left(1 + \frac{\delta_1}{2}\right). \tag{27}
\end{align*}

We define the following two-player zero-sum game over these two sets. Player A comes up with a quantum protocol $\Pi \in P_N$. Player B comes up with a distribution $\mu \in N_{XY}$. Player B's payoff is given by

\[ P_B(\Pi, \mu) = (1 - \alpha) \cdot \frac{QIC(\Pi, \mu)}{I} + \alpha \cdot \frac{\Pr_\mu[\Pi \notin T]}{\epsilon}, \]

and then player A's is given by $P_A(\Pi, \mu) = -P_B(\Pi, \mu)$. We first show the following.

Claim 4.11. The value of the game for player B is bounded by $1 + \delta_1$.



Proof. Let $\nu_B$ be a probability distribution over $N_{XY}$ representing a mixed strategy for player B. To prove the claim, it suffices to show that there is a protocol $\Pi \in P_N$ such that $\mathbb{E}_{\nu_B}[P_B(\Pi, \mu)] \leq 1 + \delta_1$. Let $\bar{\mu}$ be the distribution corresponding to averaging over $\nu_B$, that is

\[ \bar{\mu}(x,y) = \mathbb{E}_{\nu_B}\, \mu(x,y). \]

Let $\mu' \in N_{XY}$ be a distribution that is $\delta_2$-close to $\bar{\mu}$, and $\Pi' \in P_N$ the corresponding protocol. We will show that $\Pi'$ is also good for $\bar{\mu}$. We first have

\begin{align*}
\Pr_{\bar{\mu}}[\Pi' \notin T] &\leq \Pr_{\mu'}[\Pi' \notin T] + \delta_2 \\
&\leq \epsilon - \delta_2 + \delta_2 \\
&= \epsilon,
\end{align*}

in which the first inequality follows from the fact that $\bar{\mu}$ and $\mu'$ are $\delta_2$-close and the second inequality from the fact that $\Pi' \in P_N$ is the protocol corresponding to $\mu' \in N_{XY}$, i.e. $\Pi' \in \mathcal{T}^r(T, \mu', \epsilon - \delta_2)$. We also have

\begin{align*}
QIC(\Pi', \bar{\mu}) &\leq QIC(\Pi', \mu') + I \cdot \frac{\delta_1}{2} \\
&\leq I \cdot (1 + \delta_1),
\end{align*}

in which the first inequality follows from (25) and the second from the fact that $\Pi' \in P_N$ is the protocol corresponding to $\mu' \in N_{XY}$ along with (27). We obtain

\begin{align*}
\mathbb{E}_{\nu_B}[P_B(\Pi', \mu)] &= \mathbb{E}_{\nu_B}\left[(1-\alpha) \cdot \frac{QIC(\Pi', \mu)}{I} + \alpha \cdot \frac{\Pr_\mu[\Pi' \notin T]}{\epsilon}\right] \\
&= (1-\alpha) \cdot \mathbb{E}_{\nu_B}\left[\frac{QIC(\Pi', \mu)}{I}\right] + \alpha \cdot \frac{\Pr_{\bar{\mu}}[\Pi' \notin T]}{\epsilon} \\
&\leq (1-\alpha) \cdot \frac{QIC(\Pi', \bar{\mu})}{I} + \alpha \cdot \frac{\Pr_{\bar{\mu}}[\Pi' \notin T]}{\epsilon} \\
&\leq (1-\alpha) \cdot (1 + \delta_1) + \alpha \\
&\leq 1 + \delta_1,
\end{align*}

in which the first equality is by definition, the second by linearity of expectation, the first inequality is by Lemma 3.13, i.e. concavity of quantum information cost in the input state, and the second inequality is by the above results about $\Pi'$. This concludes the proof of the claim. □

By the minimax theorem for zero-sum games, the above claim implies that there exists a probability distribution $\nu_A$ over $P_N$ representing a mixed strategy for player A and such that the value of the game for player B is at most $1 + \delta_1$. That is, for all $\mu \in N_{XY}$,

\[ \mathbb{E}_{\nu_A}(P_B(\Pi, \mu)) \leq 1 + \delta_1. \]

Let $\bar{\Pi} = \mathbb{E}_{\nu_A}(\Pi)$ be the $r$-round protocol obtained by publicly averaging over $\nu_A$, as per Lemma 3.11. This is the protocol we are looking for. The following claim holds.

Claim 4.12. For all $\mu \in \mathcal{D}_{XY}$, $(1-\alpha) \cdot \frac{QIC(\bar{\Pi}, \mu)}{I} + \alpha \cdot \frac{\Pr_\mu[\bar{\Pi} \notin T]}{\epsilon} \leq 1 + 2\delta_1$.

Proof. Fix any $\mu \in \mathcal{D}_{XY}$, and let $\mu' \in N_{XY}$ be a distribution that is $\delta_2$-close to $\mu$. Then we obtain

\begin{align*}
(1-\alpha) \cdot \frac{QIC(\bar{\Pi}, \mu)}{I} + \alpha \cdot \frac{\Pr_\mu[\bar{\Pi} \notin T]}{\epsilon}
&\leq (1-\alpha) \cdot \frac{QIC(\bar{\Pi}, \mu') + I\delta_1}{I} + \alpha \cdot \frac{\Pr_{\mu'}[\bar{\Pi} \notin T] + \delta_2}{\epsilon} \\
&= (1-\alpha) \cdot \frac{QIC(\bar{\Pi}, \mu')}{I} + \alpha \cdot \frac{\mathbb{E}_{\nu_A}\Pr_{\mu'}[\Pi \notin T]}{\epsilon} + (1-\alpha) \cdot \delta_1 + \alpha \cdot \frac{\delta_2}{\epsilon} \\
&\leq (1-\alpha) \cdot \mathbb{E}_{\nu_A}\left[\frac{QIC(\Pi, \mu')}{I}\right] + \alpha \cdot \mathbb{E}_{\nu_A}\left[\frac{\Pr_{\mu'}[\Pi \notin T]}{\epsilon}\right] + \delta_1 \\
&= \mathbb{E}_{\nu_A}[P_B(\Pi, \mu')] + \delta_1 \\
&\leq 1 + 2\delta_1,
\end{align*}

in which the first inequality follows from (25) and the fact that $\mu, \mu'$ are $\delta_2$-close, the first equality is because we take expectation over a probability, the second inequality is because $\delta_2 < \epsilon \cdot \delta_1$ and by Lemma 3.11, i.e. by the convexity of quantum information cost in the protocol, the second equality is by linearity of expectation and the definition of $P_B(\Pi, \mu')$, and the last inequality is because $\nu_A$ represents the mixed strategy obtained by the minimax theorem. Since this holds for all $\mu \in \mathcal{D}_{XY}$, this concludes the proof of the claim. □

To conclude the proof of the theorem, we first note that the above claim implies that for all $\mu \in \mathcal{D}_{XY}$,

\[ QIC(\bar{\Pi}, \mu) \leq \frac{I}{1-\alpha}(1 + 2\delta_1), \]

so $\bar{\Pi}$ satisfies the quantum information cost property we are looking for. It is left to verify that it also has low error on all inputs. The above claim also implies that for all $\mu$,

\[ \Pr_\mu[\bar{\Pi} \notin T] \leq \frac{\epsilon}{\alpha}(1 + 2\delta_1). \]

Letting $\mu$ run over all atomic distributions, we get the desired error property, and so

\[ QIC^r\left(T, \frac{\epsilon}{\alpha} \cdot (1 + 2\delta_1)\right) \leq \frac{I}{1-\alpha}(1 + 2\delta_1), \]

as desired.


□ 
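
The last step of this proof uses only that both terms of the payoff are non-negative: a bound of $1 + 2\delta_1$ on the weighted sum bounds each term separately. The following Python sketch spells this out numerically. The function names and the sample parameter values are illustrative assumptions; `info` stands for the quantity $I$ and `err` for the error probability of the protocol on the given distribution.

```python
def payoff_b(qic, err, info, eps, alpha):
    """Player B's payoff: (1 - alpha) * QIC / I + alpha * error / eps."""
    return (1 - alpha) * qic / info + alpha * err / eps


def implied_bounds(info, eps, alpha, delta1):
    """Bounds on QIC and on the error implied by payoff <= 1 + 2 * delta1.

    Dropping the (non-negative) other term of the payoff gives
    QIC <= I * (1 + 2 * delta1) / (1 - alpha) and
    error <= eps * (1 + 2 * delta1) / alpha.
    """
    slack = 1 + 2 * delta1
    return info * slack / (1 - alpha), eps * slack / alpha
```

The two returned values match the right-hand sides of the two displayed inequalities that conclude the proof; the trade-off parameter $\alpha$ decides how the slack is split between information and error.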


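Theorem 4.13. For a relation $T \subseteq X \times Y \times Z_A \times Z_B$, an error parameter $\epsilon \in (0,1)$ and each value $\alpha \in (0,1)$,

\[ QIC\left(T, \frac{\epsilon}{\alpha}\right) \leq \frac{QIC_D(T, \epsilon)}{1 - \alpha}. \]

Proof. Let $I = QIC_D(T, \epsilon)$, and denote by $P_e^\mu(\Pi)$ the average error of $\Pi$ for computing $T$ on $\mu$, and by $P_T$ the set of all protocols over the same input and output spaces as $T$. Then for any $\Pi$, $P_e^\mu(\Pi)$ is continuous in $\mu$ by properties of the statistical distance. Given $\delta > 0$, define

\[ A(\Pi) = \{\mu \in \mathcal{D}_{XY} : QIC(\Pi, \mu) \geq I + 2\delta \text{ or } P_e^\mu(\Pi) \geq \epsilon + \delta\}. \]

By continuity of $QIC(\Pi, \mu)$ and $P_e^\mu(\Pi)$ in $\mu$, these sets are closed for all $\Pi \in P_T$. Then, by definition of $I$, for all $\mu$ there exists $\Pi_\mu \in \mathcal{T}(T, \mu, \epsilon)$ such that $QIC(\Pi_\mu, \mu) \leq I + \delta$, and so $\bigcap_{\Pi \in P_T} A(\Pi) = \emptyset$. Since $\mathcal{D}_{XY}$ is compact and the sets $A(\Pi)$ are closed, we get that there exists a finite set $Q \subseteq P_T$ such that $\bigcap_{\Pi \in Q} A(\Pi) = \emptyset$. We get that for all $\mu$, there exists $\Pi_\mu \in Q$ such that $QIC(\Pi_\mu, \mu) < I + 2\delta$ and $P_e^\mu(\Pi_\mu) < \epsilon + \delta$. Let $r_M = \max\{r : \text{there is } \Pi \in Q \text{ with } r \text{ rounds}\}$, then

\begin{align*}
I + 2\delta &\geq \max_\mu \min_{\Pi \in Q \cap \mathcal{T}(T, \mu, \epsilon + \delta)} QIC(\Pi, \mu) \\
&\geq QIC_D^{r_M}(T, \epsilon + \delta) \\
&\geq (1 - \alpha) \cdot QIC^{r_M}\left(T, \frac{\epsilon}{\alpha} + \frac{\delta}{\alpha}\right) \\
&\geq (1 - \alpha) \cdot QIC\left(T, \frac{\epsilon}{\alpha} + \frac{\delta}{\alpha}\right).
\end{align*}

The result follows by continuity of QIC and by taking $\delta$ to zero. □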


4.2 Subadditivity 

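Lemma 4.14. For any two protocols $\Pi^1, \Pi^2$ with $r_1, r_2$ rounds, respectively, there exists an $r$-round protocol $\Pi_2$, satisfying $\Pi_2 = \Pi^1 \otimes \Pi^2$, $r = \max(r_1, r_2)$, such that the following holds for any joint input state $\rho_{12} \in \mathcal{D}(A^1_{in} \otimes B^1_{in} \otimes A^2_{in} \otimes B^2_{in})$:

\[ QIC(\Pi_2, \rho_{12}) \leq QIC(\Pi^1, \rho_1) + QIC(\Pi^2, \rho_2), \]

with $\rho_1 = \mathrm{Tr}_{A^2_{in} B^2_{in}}(\rho_{12})$ and $\rho_2 = \mathrm{Tr}_{A^1_{in} B^1_{in}}(\rho_{12})$.

Proof. Given protocols $\Pi^1$ and $\Pi^2$, we assume without loss of generality that $r_1 \geq r_2$, and we define the protocol $\Pi_2$ in the following way.

1. Run protocols $\Pi^1, \Pi^2$ in parallel for $r_2$ rounds, on corresponding input registers $A^1_{in}, B^1_{in}, A^2_{in}, B^2_{in}$, until $\Pi^2$ has finished.

2. Finish running protocol $\Pi^1$.

3. Take as output the output registers $A^1_{out}, B^1_{out}, A^2_{out}, B^2_{out}$ of both $\Pi^1$ and $\Pi^2$.

It is clear that the channel that $\Pi_2$ implements is $\Pi_2 = \Pi^1 \otimes \Pi^2$, and the number of rounds satisfies $r = \max(r_1, r_2)$, so it is left to analyze its quantum information cost on input $\rho_{12}$. Let $R_{12}$ be a purifying register such that $\rho_{12}^{A^1_{in} B^1_{in} A^2_{in} B^2_{in} R_{12}}$ is a pure state. Also, denote the purified joint state in round $i$ as $(\rho^i_{12})^{A^1_i B^1_i C^1_i A^2_i B^2_i C^2_i R_{12}}$, and the local state for protocol $\Pi^1$ as

\[ (\rho^i_1)^{A^1_i B^1_i C^1_i} = \mathrm{Tr}_{A^2_i B^2_i C^2_i R_{12}}\left((\rho^i_{12})^{A^1_i B^1_i C^1_i A^2_i B^2_i C^2_i R_{12}}\right), \tag{28} \]

and similarly for that of protocol $\Pi^2$. Notice that for all $i$, $(\rho^i_1)^{A^1_i B^1_i C^1_i}$ is purified by $(\rho^i_1)^{A^1_i B^1_i C^1_i A^2_{in} B^2_{in} R_{12}} \otimes \phi_2^{T^2_A T^2_B}$, with $A^2_{in} B^2_{in} R_{12}$ the registers of state $\rho_{12}$ before application of the unitaries corresponding to $\Pi^1$, and $\phi_2$ is the pure entangled state used in $\Pi_2$. If we denote, for $i \geq r_2 + 1$, $A^2_i = A^2_{out} \otimes (A')^2$, $B^2_i = B^2_{out} \otimes (B')^2$, then by the definition of QIC and application of the chain rule,

\begin{align*}
2 \cdot QIC(\Pi_2, \rho_{12}) &= \sum_{i=1,\, i\ \mathrm{odd}}^{r_2} I(C^1_i C^2_i; R_{12} \mid B^1_i B^2_i)_{\rho_{12}} + \sum_{i=1,\, i\ \mathrm{even}}^{r_2} I(C^1_i C^2_i; R_{12} \mid A^1_i A^2_i)_{\rho_{12}} \\
&\quad + \sum_{i=r_2+1,\, i\ \mathrm{odd}}^{r_1} I(C^1_i; R_{12} \mid B^1_i B^2_i)_{\rho_{12}} + \sum_{i=r_2+1,\, i\ \mathrm{even}}^{r_1} I(C^1_i; R_{12} \mid A^1_i A^2_i)_{\rho_{12}} \\
&= \sum_{i=1,\, i\ \mathrm{odd}}^{r_2} I(C^2_i; R_{12} \mid B^1_i B^2_i C^1_i)_{\rho_{12}} + \sum_{i=1,\, i\ \mathrm{even}}^{r_2} I(C^2_i; R_{12} \mid A^1_i A^2_i C^1_i)_{\rho_{12}} \\
&\quad + \sum_{i=1,\, i\ \mathrm{odd}}^{r_1} I(C^1_i; R_{12} \mid B^1_i B^2_i)_{\rho_{12}} + \sum_{i=1,\, i\ \mathrm{even}}^{r_1} I(C^1_i; R_{12} \mid A^1_i A^2_i)_{\rho_{12}}.
\end{align*}

Now for protocol $\Pi^1$, as noted above, the registers $A^2_{in} B^2_{in} R_{12} T^2_A T^2_B$ purify $(\rho^i_1)^{A^1_i B^1_i C^1_i}$ for all $i$, so

\begin{align*}
2 \cdot QIC(\Pi^1, \rho_1) &= \sum_{i=1,\, i\ \mathrm{odd}}^{r_1} I(C^1_i; A^2_{in} B^2_{in} R_{12} T^2_A T^2_B \mid B^1_i)_{\rho_1} + \sum_{i=1,\, i\ \mathrm{even}}^{r_1} I(C^1_i; A^2_{in} B^2_{in} R_{12} T^2_A T^2_B \mid A^1_i)_{\rho_1} \\
&= \sum_{i=1,\, i\ \mathrm{odd}}^{r_1} I(C^1_i; A^2_i B^2_i C^2_i R_{12} \mid B^1_i)_{\rho_{12}} + \sum_{i=1,\, i\ \mathrm{even}}^{r_1} I(C^1_i; A^2_i B^2_i C^2_i R_{12} \mid A^1_i)_{\rho_{12}} \\
&= \sum_{i=1,\, i\ \mathrm{odd}}^{r_1} I(C^1_i; B^2_i \mid B^1_i)_{\rho_{12}} + \sum_{i=1,\, i\ \mathrm{even}}^{r_1} I(C^1_i; A^2_i \mid A^1_i)_{\rho_{12}} \\
&\quad + \sum_{i=1,\, i\ \mathrm{odd}}^{r_1} I(C^1_i; R_{12} \mid B^1_i B^2_i)_{\rho_{12}} + \sum_{i=1,\, i\ \mathrm{even}}^{r_1} I(C^1_i; R_{12} \mid A^1_i A^2_i)_{\rho_{12}} \\
&\quad + \sum_{i=1,\, i\ \mathrm{odd}}^{r_1} I(C^1_i; A^2_i C^2_i \mid B^1_i B^2_i R_{12})_{\rho_{12}} + \sum_{i=1,\, i\ \mathrm{even}}^{r_1} I(C^1_i; B^2_i C^2_i \mid A^1_i A^2_i R_{12})_{\rho_{12}} \\
&\geq \sum_{i=1,\, i\ \mathrm{odd}}^{r_1} I(C^1_i; R_{12} \mid B^1_i B^2_i)_{\rho_{12}} + \sum_{i=1,\, i\ \mathrm{even}}^{r_1} I(C^1_i; R_{12} \mid A^1_i A^2_i)_{\rho_{12}},
\end{align*}

in which the first equality is by definition, the second is by isometric invariance of the conditional quantum mutual information (CQMI), the third by the chain rule for CQMI, and the inequality is by non-negativity of CQMI. Similarly for protocol $\Pi^2$, with a slightly different application of the chain rule, we get

\begin{align*}
2 \cdot QIC(\Pi^2, \rho_2) &= \sum_{i=1,\, i\ \mathrm{odd}}^{r_2} I(C^2_i; A^1_{in} B^1_{in} R_{12} T^1_A T^1_B \mid B^2_i)_{\rho_2} + \sum_{i=1,\, i\ \mathrm{even}}^{r_2} I(C^2_i; A^1_{in} B^1_{in} R_{12} T^1_A T^1_B \mid A^2_i)_{\rho_2} \\
&= \sum_{i=1,\, i\ \mathrm{odd}}^{r_2} I(C^2_i; A^1_i B^1_i C^1_i R_{12} \mid B^2_i)_{\rho_{12}} + \sum_{i=1,\, i\ \mathrm{even}}^{r_2} I(C^2_i; A^1_i B^1_i C^1_i R_{12} \mid A^2_i)_{\rho_{12}} \\
&= \sum_{i=1,\, i\ \mathrm{odd}}^{r_2} I(C^2_i; B^1_i C^1_i \mid B^2_i)_{\rho_{12}} + \sum_{i=1,\, i\ \mathrm{even}}^{r_2} I(C^2_i; A^1_i C^1_i \mid A^2_i)_{\rho_{12}} \\
&\quad + \sum_{i=1,\, i\ \mathrm{odd}}^{r_2} I(C^2_i; R_{12} \mid B^1_i B^2_i C^1_i)_{\rho_{12}} + \sum_{i=1,\, i\ \mathrm{even}}^{r_2} I(C^2_i; R_{12} \mid A^1_i A^2_i C^1_i)_{\rho_{12}} \\
&\quad + \sum_{i=1,\, i\ \mathrm{odd}}^{r_2} I(C^2_i; A^1_i \mid B^1_i B^2_i C^1_i R_{12})_{\rho_{12}} + \sum_{i=1,\, i\ \mathrm{even}}^{r_2} I(C^2_i; B^1_i \mid A^1_i A^2_i C^1_i R_{12})_{\rho_{12}} \\
&\geq \sum_{i=1,\, i\ \mathrm{odd}}^{r_2} I(C^2_i; R_{12} \mid B^1_i B^2_i C^1_i)_{\rho_{12}} + \sum_{i=1,\, i\ \mathrm{even}}^{r_2} I(C^2_i; R_{12} \mid A^1_i A^2_i C^1_i)_{\rho_{12}}.
\end{align*}

The result then follows by comparing terms. □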

4.3 Reducing the Error for Functions 

Similarly to communication, it is possible to reduce the error when computing functions without increasing the information too much.

Lemma 4.15. For any function $f$ and error parameter $\epsilon > 0$, the following holds:

\[ QIC(f, \epsilon) \leq O(\log(1/\epsilon) \cdot QIC(f, 1/3)). \]

Proof. Given $\delta > 0$, let $\Pi$ be a protocol computing $f$ correctly except with probability $1/3$ on every input and satisfying $QIC(\Pi, \mu) \leq QIC(f, 1/3) + \delta$ for all $\mu$. Let $n \in O(\log 1/\epsilon)$ be given by the Chernoff bound such that protocol $\Pi_n$, running $\Pi$ $n$ times in parallel as per Lemma 4.14, with each input being a copy of the instance to $f$, and taking a majority vote (with arbitrary tie-breaking), computes $f$ correctly except with probability $\epsilon$ on every input. This $n$ can be chosen independently of $\delta$. We now argue on the quantum information cost of $\Pi_n$. Consider an arbitrary distribution $\mu$ for $f$, and let $\mu_n$ be the distribution once the $n$ copies have been made. If we denote the marginal for the $i$-th copy by $\mu^i$, then $\mu^i = \mu$. By Lemma 4.14 and an easy induction, we then get that

\begin{align*}
QIC(f, \epsilon) &\leq QIC(\Pi_n, \mu_n) \\
&\leq n\,QIC(\Pi, \mu) \\
&\leq n(QIC(f, 1/3) + \delta).
\end{align*}

The result follows by taking $\delta$ to 0. □
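
The choice of $n$ in this proof can be illustrated with an exact binomial tail instead of the Chernoff bound. The Python sketch below assumes that the $n$ parallel runs err independently, each with probability at most `p_err`, and pessimistically counts a tie as a failure; the function names are hypothetical.

```python
from math import comb


def majority_error(n, p_err):
    """Probability that at least half of n independent runs are wrong.

    Each run errs with probability p_err; ties are counted as failures,
    so this upper-bounds the error of a majority vote with any tie-breaking.
    """
    first_bad = (n + 1) // 2  # smallest number of wrong runs that can spoil the vote
    return sum(
        comb(n, j) * p_err**j * (1 - p_err) ** (n - j)
        for j in range(first_bad, n + 1)
    )


def repetitions_needed(eps, p_err=1 / 3):
    """Smallest odd n whose majority vote errs with probability at most eps."""
    n = 1
    while majority_error(n, p_err) > eps:
        n += 2  # stay on odd n so that ties cannot occur
    return n
```

Since the binomial tail decays exponentially in $n$ for `p_err` below $1/2$, the value returned by `repetitions_needed` grows like $\log(1/\epsilon)$, which is the scaling used in the lemma.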

4.4 Reduction from DISJ to AND 

With the following definition, the above proof also establishes the following corollary.

Definition 4.16. For all $r \in \mathbb{N}$, $\epsilon \in [0,1]$,

\[ QIC_0^r(AND, \epsilon) = \inf_{\Pi \in \mathcal{T}^r(AND, \epsilon)} \max_{\mu_0} QIC(\Pi, \mu_0), \]

in which the maximum ranges over all $\mu_0$ satisfying $\mu_0(1,1) = 0$.

Corollary 4.17. For any $\epsilon > 0$ and $r \in \mathbb{N}$,

\[ QIC_0^r(AND, \epsilon) \leq O(\log(1/\epsilon) \cdot QIC_0^r(AND, 1/3)). \]

We provide a slight variant of the argument of [Tou15] to obtain a low information protocol for AND from a protocol for disjointness.

Lemma 4.18. For any $n, r, \epsilon$ and $\mu_0$ such that $\mu_0(1,1) = 0$,

\[ \inf_{\Pi_A \in \mathcal{T}^r(AND, \epsilon)} QIC(\Pi_A, \mu_0) \leq \inf_{\Pi_D \in \mathcal{T}^r(DISJ_n, \epsilon)} \frac{1}{n} QIC(\Pi_D, \mu_0^{\otimes n}). \]

Proof. Let $I_n = \inf_{\Pi_D \in \mathcal{T}^r(DISJ_n, \epsilon)} QIC(\Pi_D, \mu_0^{\otimes n})$. We prove the result by induction on $n$. The base case is trivial since $DISJ_1 = \neg AND$, and so a protocol to compute $DISJ_1$ with error $\epsilon$ can be used to compute AND with error $\epsilon$ and vice-versa. In particular, we get $I_1 = \inf_{\Pi_A \in \mathcal{T}^r(AND, \epsilon)} QIC(\Pi_A, \mu_0)$. For the induction, suppose the result holds for $DISJ_{n-1}$; we will use Lemma 3.10 to go from $DISJ_n$ to $DISJ_1$ and $DISJ_{n-1}$. Indeed, given $\delta > 0$ and $\Pi_D$ computing $DISJ_n$ with error $\epsilon$ and satisfying $QIC(\Pi_D, \mu_0^{\otimes n}) \leq I_n + \delta$, we can use Lemma 3.10 with $\rho_1 = \mu_0$, $\rho_2 = \mu_0^{\otimes(n-1)}$, and then it is clear that $\Pi^1$ computes $DISJ_1$ with error $\epsilon$ and $\Pi^2$ computes $DISJ_{n-1}$ with error $\epsilon$. We get

\begin{align*}
I_n + \delta &\geq QIC(\Pi_D, \mu_0^{\otimes n}) \\
&= QIC(\Pi^1, \mu_0) + QIC(\Pi^2, \mu_0^{\otimes(n-1)}) \\
&\geq I_1 + I_{n-1} \\
&\geq nI_1.
\end{align*}

□

The following lemma is very similar to Theorem 4.10. The only difference is that the distributions we consider are restricted and on the right hand side the error of the protocol is measured in the worst case. Since the error is worst case, there is no loss in the error, and the payoff function would be simply $P_B(\Pi, \mu) = QIC(\Pi, \mu)/I$.

Lemma 4.19.

\[ QIC_0^r(AND, \epsilon) = \max_{\mu_0 :\, \mu_0(1,1) = 0}\ \inf_{\Pi \in \mathcal{T}^r(AND, \epsilon)} QIC(\Pi, \mu_0). \]

Lemma 4.20. For all $r, n \in \mathbb{N}$,

\[ QCC^r(DISJ_n, 1/3) \geq n \cdot QIC_0^r(AND, 1/3). \]

Proof. The result follows from the following chain of inequalities:

\begin{align*}
QCC^r(DISJ_n, 1/3) &\geq QIC^r(DISJ_n, 1/3) \\
&\geq \max_{\mu_0}\ \inf_{\Pi_D \in \mathcal{T}^r(DISJ_n, 1/3)} QIC(\Pi_D, \mu_0^{\otimes n}) \\
&\geq \max_{\mu_0}\ \inf_{\Pi_A \in \mathcal{T}^r(AND, 1/3)} n \cdot QIC(\Pi_A, \mu_0) \\
&\geq n \cdot QIC_0^r(AND, 1/3).
\end{align*}

The first inequality is by Lemma 4.3, the second since, on the r.h.s., the maximization is over a smaller set of product distributions with $\mu_0(1,1) = 0$ and the minimization over a larger set of protocols, the third is by Lemma 4.18, and the last is by Lemma 4.19. □

5 Lower bound on QIC by generalized discrepancy method

5.1 Compression

Definition 5.1. We say that $QCC(f^k, \mu^k, \eta_1 k, \eta_2) \leq C$ if there exists a protocol $\pi$ for $f^k$ s.t. $QCC(\pi) \leq C$ and

\[ \Pr[\pi \text{ computes } \geq \eta_1 k \text{ coordinates correctly}] \geq 1 - \eta_2. \]

Here the probability is both over the distribution $\mu^k$ and the randomness of the protocol (which includes the randomness due to quantum measurements). We don't require the protocol to declare which coordinates were computed correctly.

Lemma 5.2. If there exists a protocol $\Pi$ for $f$ with error $\leq \epsilon$ w.r.t. $\mu$ s.t. $QIC(\Pi, \mu) = I$, then for all $\epsilon', \delta > 0$, there exists $k_0(\Pi, \mu, \epsilon', \delta)$ such that for all $k \geq k_0$, $QCC(f^k, \mu^k, (1 - 2\epsilon)k, e^{-2\epsilon^2 k} + \epsilon') \leq k(I + \delta)$.

Proof. Suppose $(E_1, \ldots, E_k)$ is the vector of indicator random variables of the errors in various coordinates of $\Pi^{\otimes k}$, i.e. $E_i = 1$ if an error occurred on the $i$-th coordinate. Also look at $\Pi_k$ obtained from Lemma 3.14 for large enough $k$ with parameters $2\epsilon', \delta$ and where $\rho$ is $\mu$. Suppose $(E_1', \ldots, E_k')$ is the vector of errors for $\Pi_k$. According to Lemma 3.14, $\Pi_k$ satisfies the following:

\[ \mathbb{E}_{((x_1, \ldots, x_k), (y_1, \ldots, y_k)) \sim \mu^k} \left\| \Pi_k((x_1, \ldots, x_k), (y_1, \ldots, y_k)) - \Pi^{\otimes k}((x_1, \ldots, x_k), (y_1, \ldots, y_k)) \right\|_1 \leq 2\epsilon'. \]

Hence it follows that

\[ \|(E_1, \ldots, E_k) - (E_1', \ldots, E_k')\|_{TV} \leq \epsilon'. \]

Here $\|P - Q\|_{TV}$ is the total variation distance between the distributions $P$ and $Q$ (we are not distinguishing between random variables and their distributions). Since $\Pr[\sum_i E_i \geq 2\epsilon k] \leq e^{-2\epsilon^2 k}$ by Chernoff bounds, it follows that

\[ \Pr\left[\sum_i E_i' \geq 2\epsilon k\right] \leq e^{-2\epsilon^2 k} + \epsilon', \]

which implies the lemma along with the fact that $QCC(\Pi_k) \leq (I + \delta)k$.

□


5.2 Average case to worst case 

In this section, we prove the following lemma which turns a protocol for average-case inputs into a protocol for worst-case inputs.

Lemma 5.3. Suppose $f_n : \{0,1\}^n \times \{0,1\}^n \to \{0,1\}$ is an arbitrary boolean function. Let $k \geq 2^{5n}$ and $\epsilon > 10k^{-0.005}$. Assume for any product input distribution $\mu^k$, there exists a protocol $\pi_{\mu^k}$ with $QCC(\pi_{\mu^k}) \leq l$ that computes at least $(1 - \alpha)k$ coordinates of $f^k$ correctly with probability at least $\gamma$. Then there exists a protocol $\tau$ s.t. for any input $((x_1, \ldots, x_k), (y_1, \ldots, y_k))$, for any integer $c \geq 3$ and constant $\epsilon > 0$, $\tau$ computes at least $(1 - 2^{-c/2} - c\alpha)k$ coordinates of $f^k$ correctly with

probability at least \ (( 7 J )fc ) ~ ^ _22 2 ck ^j- Also $QCC(\tau) \leq c \cdot l + o(k)$.

Proof. In this lemma, we want to construct a protocol $\tau$ which works for an arbitrary input based on protocols which work on product input distributions (product across coordinates). The main idea of the proof is that corresponding to any input $((x_1, \ldots, x_k), (y_1, \ldots, y_k))$ ($x_i$ and $y_i$ are inputs of an $f_n$ instance and have $n$ bits), we can associate a $\mu$, which is the empirical distribution:

\[ \mu(x,y) = \frac{\#\{i : (x_i, y_i) = (x,y)\}}{k}. \]

So it makes sense to construct $\tau$ from $\pi_{\mu^k}$. The players can simulate $\mu^k$ by sampling independent coordinates from their input (with replacement). However, the issue is that the players don't know $\mu$, so they have no idea what $\pi_{\mu^k}$ is. So in the actual protocol Alice and Bob will first sample some coordinates to get an estimate $\tilde{\mu}$ of $\mu$ and then run protocol $\pi_{\tilde{\mu}^k}$. The protocol $\tau$ is described in Protocol 1.

Inputs: $(x_1, \ldots, x_k)$ and $(y_1, \ldots, y_k)$

1. Get an estimate $\tilde{\mu}$ of $\mu$.

2. Alice and Bob use shared randomness to obtain random independent samples $j_1, \ldots, j_{ck}$ from $[k]$. Run the protocol $\pi_{\tilde{\mu}^k}$ $c$ times. In the $t$-th iteration, the protocol is run on inputs $(x_{j_{(t-1)k+1}}, \ldots, x_{j_{tk}})$, $(y_{j_{(t-1)k+1}}, \ldots, y_{j_{tk}})$. In the process we obtain answers for various coordinates (some of the coordinates will be sampled multiple times and we will obtain multiple answers for them).

3. If a coordinate was sampled in the previous step, output the answer $\pi_{\tilde{\mu}^k}$ gave for it. If they got multiple results on one coordinate, they will output the first one. If a coordinate was not sampled, output 0 on that coordinate.

Protocol 1: Protocol $\tau$
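
The classical bookkeeping around Protocol 1 (empirical distribution, sampling with replacement, and assembling the final answers) can be sketched in Python as follows. This is only an illustration of the data flow under stated assumptions: the quantum protocol itself is not modelled, the answers of each run are passed in as plain lists, and all function names are hypothetical.

```python
import random
from collections import Counter


def empirical_distribution(xs, ys):
    """mu(x, y) = (number of i with (x_i, y_i) = (x, y)) / k."""
    k = len(xs)
    counts = Counter(zip(xs, ys))
    return {pair: count / k for pair, count in counts.items()}


def sample_blocks(k, c, rng):
    """Draw c * k indices from range(k) with replacement, split into c blocks of k."""
    indices = [rng.randrange(k) for _ in range(c * k)]
    return [indices[t * k:(t + 1) * k] for t in range(c)]


def combine_answers(k, blocks, answers):
    """Keep the first answer obtained for each sampled coordinate, 0 otherwise.

    blocks[t][s] is the coordinate fed into slot s of run t, and
    answers[t][s] is the answer that run t produced for that slot.
    """
    output = [0] * k
    seen = set()
    for block, block_answers in zip(blocks, answers):
        for coordinate, answer in zip(block, block_answers):
            if coordinate not in seen:
                seen.add(coordinate)
                output[coordinate] = answer
    return output
```

With $c$ blocks of $k$ samples each, a fixed coordinate is missed by all samples with probability $(1 - 1/k)^{ck}$, which is at most $e^{-c}$; such coordinates receive the default output 0.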

Now let's analyze this protocol. We first need the following two lemmas to show how to get an estimate $\tilde{\mu}$ of $\mu$.

Lemma 5.4. After communicating $O(k^{0.52} \log k)$ bits, for some specific input $(x,y)$, with success probability at least $1 - 1/k$, Alice and Bob know $\mu(x,y)$ exactly if $\mu(x,y) \cdot k < k^{0.02}$; otherwise Alice and Bob know that $\mu(x,y) \cdot k \geq k^{0.02}$.

Proof. In [BCW98], it was shown that the quantum communication complexity of computing the disjointness of two inputs of length $k$ is $O(\sqrt{k} \log k)$. The corresponding protocol has constant error rate and will find one intersection place. We will use this protocol to solve our problem by the following reduction. For each input $(x_i, y_i)$, we set $a_i = 1_{x_i = x}$ and $b_i = 1_{y_i = y}$. Then finding $(x,y)$ in the input is just like finding an intersection between $a = (a_1, \ldots, a_k)$ and $b = (b_1, \ldots, b_k)$. Protocol 2 shows how to finish the task described in the lemma.

1. Set $a$ and $b$ as we just described. Set $cnt = 0$.

2. Do the following step $c_1 \cdot k^{0.02}$ times, where $c_1$ is some constant to be determined in the proof:

3. Use the protocol for DISJ in [BCW98] to find an intersection between $a$ and $b$; let it be at place $j$. Alice and Bob communicate 2 bits to check if $a_j = b_j = 1$. If it is true, then set $cnt = cnt + 1$, $a_j = 0$, $b_j = 0$.

Protocol 2: Protocol count

Let's analyze this protocol. First, its quantum communication cost is clearly $O(k^{0.52} \log k)$, as the DISJ protocol has quantum communication cost $O(\sqrt{k} \log k)$ and is run $c_1 \cdot k^{0.02}$ times. Then for each repeat of step 3, if the DISJ protocol gives a wrong answer, we will not do anything. And if the DISJ protocol gives a correct intersection, the counter will be increased by one, the intersection place will be removed, and we can find other intersections. Thus we only have to show that with probability at least $1 - 1/k$, the DISJ protocol gives a correct answer at least $k^{0.02}$ times. Assume the DISJ protocol succeeds with some constant probability $p$. Let $C_r$ denote the random variable for the number of correct answers the DISJ protocol gives. We know $\mathbb{E}[C_r] = p \cdot c_1 \cdot k^{0.02}$. By the additive Chernoff bound, the probability that the DISJ protocol gives a correct answer at least $k^{0.02}$ times is

\[ \Pr[C_r \geq k^{0.02}] = 1 - \Pr[C_r < k^{0.02}] \geq 1 - e^{-2(p \cdot c_1 \cdot k^{0.02} - k^{0.02})^2 / (c_1 \cdot k^{0.02})}. \]

By picking $c_1$ properly, for example $c_1 = 2/p$, we get $\Pr[C_r \geq k^{0.02}] \geq 1 - 1/k$.


□ 
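As a rough illustration of the counting loop in Protocol 2, the following Python sketch replaces the quantum DISJ subroutine of [BCW98] with a classical stand-in that returns a true intersection with probability `p`. The function name, the `p` parameter, and the noise model are assumptions made for illustration only; the sketch reproduces the bookkeeping (2-bit check, increment, removal), not the communication cost.

```python
import random

def protocol_count(xs, ys, x, y, threshold, p=0.7, rng=None):
    """Classical stand-in for Protocol 2 (hypothetical helper, not from the paper).

    Returns (cnt, reached): cnt is the number of verified occurrences of (x, y),
    and reached tells whether cnt hit the threshold (k^0.02 in Lemma 5.4).
    """
    rng = rng or random.Random(0)
    c1 = 2.0 / p                       # the constant chosen in the proof
    a = [1 if xi == x else 0 for xi in xs]
    b = [1 if yi == y else 0 for yi in ys]
    cnt = 0
    for _ in range(int(c1 * threshold) + 1):
        hits = [j for j in range(len(a)) if a[j] == 1 and b[j] == 1]
        if hits and rng.random() < p:
            j = rng.choice(hits)       # the subroutine succeeds
        else:
            j = rng.randrange(len(a))  # arbitrary, possibly wrong, answer
        if a[j] == 1 and b[j] == 1:    # the 2-bit check by Alice and Bob
            cnt += 1
            a[j] = 0                   # remove the intersection place
            b[j] = 0
    return cnt, cnt >= threshold
```

Because every increment is verified by the 2-bit check, the counter can never exceed the true number of occurrences; a faulty subroutine call only wastes a repetition.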


Lemma 5.5. Let $\epsilon > 10 k^{-0.005}$ be some constant. After communicating $O(k^{0.99} \cdot n + 2^{2n} \cdot k^{0.52} \log k)$ bits, with probability at least $1/2$, Alice and Bob agree on some $\tilde{\mu}$ such that for any $(x,y)$,

$$\frac{\tilde{\mu}(x,y)}{\mu(x,y)} \leq 1 + \epsilon.$$


Proof. We use the following protocol to estimate $\mu$:

Inputs: $(x_1, \ldots, x_k)$ and $(y_1, \ldots, y_k)$

1. Sample the coordinates randomly $k^{0.99}$ times using public randomness (with replacement). Alice and Bob exchange their inputs for these coordinates. For each $(x,y) \in \{0,1\}^n \times \{0,1\}^n$, count the number of times it appears in these coordinates and denote the count by $c_1(x,y)$.

2. For all $(x,y)$, use Lemma 5.4 to count the number of times $(x,y)$ appears in the input and denote the count obtained by $c_2(x,y)$.

3. Combine $c_1$ and $c_2$ into $c_3$. For each $(x,y)$, if $c_2(x,y) \geq k^{0.02}$, let $c_3(x,y) = c_1(x,y) \cdot k^{0.01}$; otherwise let $c_3(x,y) = c_2(x,y)$.

4. Set $\tilde{\mu}(x,y) = \frac{c_3(x,y)}{\sum_{x',y'} c_3(x',y')}$.

Protocol 3: Estimate $\mu$


Let us first analyze the communication cost of this protocol. The first step needs at most $O(k^{0.99} n)$ communication. The second step, by Lemma 5.4, needs at most $O(2^{2n} \cdot k^{0.52} \log k)$ communication. Summing up, this protocol needs $O(k^{0.99} \cdot n + 2^{2n} \cdot k^{0.52} \log k)$ bits of communication.

Now consider the following events:

1. For all $(x,y)$ such that $\mu(x,y) \cdot k \geq k^{0.02}$, $|c_1(x,y) \cdot k^{0.01} - \mu(x,y) \cdot k| \leq \frac{\epsilon}{3} \mu(x,y) \cdot k$.

2. For any $(x,y)$, the protocol described in Lemma 5.4 does not fail.


If these two events happen, then we know that $|c_3(x,y) - \mu(x,y) \cdot k| \leq \frac{\epsilon}{3} \mu(x,y) \cdot k$, and therefore, as desired,

$$\tilde{\mu}(x,y) = \frac{c_3(x,y)}{\sum_{x',y'} c_3(x',y')} \leq \frac{(1 + \frac{\epsilon}{3})\, \mu(x,y) \cdot k}{(1 - \frac{\epsilon}{3}) \cdot k} \leq (1 + \epsilon)\, \mu(x,y).$$

Finally, we only have to make sure that these two events happen with probability at least $1/2$. For the first event, by the multiplicative Chernoff bound and the union bound, it does not happen with probability at most

$$2^{2n} \cdot \Pr\left[\left|c_1(x,y)/k^{0.99} - \mu(x,y)\right| > \frac{\epsilon}{3} \mu(x,y)\right] \leq 2k\, e^{-(\epsilon/3)^2 \mu(x,y) k^{0.99}/3} \leq 2k\, e^{-\epsilon^2 k^{0.01}/27} < 1/4.$$

For the second event, by Lemma 5.4 and the union bound, it does not happen with probability at most $2^{2n} \cdot \frac{1}{k} < 1/4$. Thus these two events happen with probability at least $1/2$. □
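The way Protocol 3 combines its two counts can be sketched classically. In the Python snippet below, exact counting stands in for the quantum counting procedure of Lemma 5.4, and the rescaling factor is $k$ divided by the actual number of samples (which equals $k^{0.01}$ up to rounding). The function name and these simplifications are assumptions made for illustration.

```python
import random
from collections import Counter

def estimate_mu(pairs, rng=None):
    """Sketch of Protocol 3 on a list of k input pairs (x_i, y_i).

    Hypothetical helper: exact counts replace the procedure of Lemma 5.4.
    Returns a dict mapping each pair seen in the input to its estimate.
    """
    rng = rng or random.Random(0)
    k = len(pairs)
    samples = max(1, round(k ** 0.99))
    threshold = k ** 0.02

    # Step 1: sample coordinates with replacement and count pairs.
    c1 = Counter(pairs[rng.randrange(k)] for _ in range(samples))
    # Step 2: count occurrences of each pair in the whole input.
    c2 = Counter(pairs)
    # Step 3: frequent pairs use the rescaled sample count, rare pairs the exact count.
    c3 = {}
    for xy, exact in c2.items():
        c3[xy] = c1[xy] * k / samples if exact >= threshold else exact
    # Step 4: normalize.
    total = sum(c3.values())
    return {xy: value / total for xy, value in c3.items()}
```

The normalization in step 4 is what makes the output a distribution even though the rescaled sample counts of frequent pairs need not sum to $k$.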

Let us consider the communication cost of $\tau$. For the first step, the cost is $O(k^{0.99} \cdot n + 2^{2n} \cdot k^{0.52} \log k) = o(k)$. For the second step, the quantum communication complexity is at most $c \cdot l$. For the third step, the cost is 0. Therefore $\mathrm{QCC}(\tau) \leq c \cdot l + o(k)$.

Let us say that the protocol $\tau$ succeeds when the following things happen:

1. For all $(x,y)$, $\frac{\tilde{\mu}(x,y)}{\mu(x,y)} \leq 1 + \epsilon$.

2. The $c$ runs of the protocol in step 2 of protocol $\tau$ all compute at least $(1 - \alpha)k$ coordinates correctly.

3. The number of $i \in [k]$ such that coordinate $i$ is not sampled in step 2 of protocol $\tau$ is at most $2^{-c/2} k$.

If $\tau$ succeeds, then it computes at least $(1 - 2^{-c/2} - c\alpha)k$ coordinates correctly. This is because errors come from two possible sources:

1. Some coordinates are not sampled. When $\tau$ succeeds, the number of coordinates that are not sampled is at most $2^{-c/2} k$.

2. Some coordinates' results are wrong in step 2. When $\tau$ succeeds, the number of errors from step 2 is at most $c\alpha k$.


Finally, let us analyze the success probability of protocol $\tau$ step by step:

1. For step one, by Lemma 5.5, we succeed with probability at least $1/2$.

2. For step two, first we know that when running on distribution $\tilde{\mu}^k$, we succeed with probability at least $\gamma$. Since for any $(x,y)$ we have $\frac{\tilde{\mu}(x,y)}{\mu(x,y)} \leq 1 + \epsilon$, if we run the protocol on distribution $\mu^k$, the success probability will be at least $\frac{\gamma}{(1+\epsilon)^k}$. When running this protocol $c$ times independently, the success probability will be at least $\left(\frac{\gamma}{(1+\epsilon)^k}\right)^c$. Note that when we sample coordinates independently at random, the distribution we induce is $\mu^k$.
3. It only remains to analyze the probability that the number of coordinates not sampled in step 2 of protocol $\tau$ is at least $2^{-c/2} k$. For each coordinate $i$, define $s_i$ to be the random variable that indicates whether coordinate $i$ is sampled or not (1 means not sampled and 0 means sampled). Then we have $\mathbb{E}[s_i] = \left(1 - \frac{1}{k}\right)^{kc} \leq 2^{-c}$. In order to show via a Chernoff bound that the failure probability is small, we will show that all the $s_i$'s are negatively correlated. To show that they are negatively correlated, we only have to show

$$\forall I \subseteq [k], \quad \Pr\left[\bigwedge_{i \in I} s_i = 1\right] \leq \prod_{i \in I} \Pr[s_i = 1].$$

Notice that $\Pr\left[\bigwedge_{i \in I} s_i = 1\right] = \left(1 - \frac{|I|}{k}\right)^{kc}$ and $\Pr[s_i = 1] = \left(1 - \frac{1}{k}\right)^{kc}$. So we have,

$$\forall I \subseteq [k], \quad \Pr\left[\bigwedge_{i \in I} s_i = 1\right] = \left(1 - \frac{|I|}{k}\right)^{kc} \leq \left(1 - \frac{1}{k}\right)^{|I| kc} = \prod_{i \in I} \Pr[s_i = 1].$$

Since all the $s_i$'s are negatively correlated, by the Chernoff bound for negatively correlated random variables, for example see m, we have that the failure probability satisfies

$$\Pr\left[\sum_{i=1}^{k} s_i \geq 2^{-c/2} k\right] \leq e^{-2k(2^{-c/2} - 2^{-c})^2} \leq e^{-2^{2-2c} k} \leq 2^{-2^{2-2c} k}.$$

The second inequality holds for all $c \geq 3$. Notice that the event that we err in the first step is independent of the event that we err in the second step. So the success probability of $\tau$ is at least $\frac{1}{2}\left(\left(\frac{\gamma}{(1+\epsilon)^k}\right)^c - 2^{-2^{2-2c} k}\right)$.


□ 
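The two elementary inequalities used in step 3 can be checked numerically. The Python sketch below assumes, as in the proof, that $kc$ coordinates are drawn uniformly with replacement from $[k]$; the function names are hypothetical and the check is only a sanity test for small parameters, not a proof.

```python
import math

def prob_all_unsampled(k, c, size):
    """Pr[no coordinate of a fixed set of `size` coordinates is hit by k*c draws]."""
    return (1 - size / k) ** (k * c)

def product_of_marginals(k, c, size):
    """Product over the set of Pr[s_i = 1] = (1 - 1/k)^(k*c)."""
    return ((1 - 1 / k) ** (k * c)) ** size

def tail_bounds(k, c):
    """Return (e^{-2k(2^{-c/2} - 2^{-c})^2}, 2^{-2^{2-2c} k})."""
    hoeffding = math.exp(-2 * k * (2 ** (-c / 2) - 2 ** (-c)) ** 2)
    simplified = 2.0 ** (-(2 ** (2 - 2 * c)) * k)
    return hoeffding, simplified
```

For $c = 2$ the first tail expression exceeds the simplified one, which is consistent with the restriction to $c \geq 3$ stated in the proof.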


5.3 Lower bound on QIC 

Definition 5.6. We say that $\mathrm{QCC}(f^k, \eta_1 k, \eta_2) \leq C$ if there exists a protocol $\pi$ for $f^k$ s.t. $\mathrm{QCC}(\pi) \leq C$ and

$$\Pr[\pi \text{ computes} \geq \eta_1 k \text{ coordinates correctly}] \geq 1 - \eta_2.$$

Here the probability is over the randomness of the protocol (which includes the randomness due to quantum measurements). We do not require the protocol to declare which coordinates were computed correctly.

Theorem 5.7. There exists an absolute constant $\eta > 0$ s.t. for any boolean function $f$, $\mathrm{QIC}_D(f, \eta) \geq \Omega(\mathrm{GDM}_{1/5}(f) - O(1))$.

Proof. Let $\eta > 0$ be a sufficiently small constant to be fixed later. Suppose $\max_\mu \mathrm{QIC}(f, \mu, \eta) = I$. We will show that for sufficiently large $k$, it holds that

$$\mathrm{QCC}(f^k, (1 - \epsilon_{sh})k, 1 - 2^{-\epsilon_{sh} k}) \leq O(k \cdot (I + 2)) + o(k),$$

from which the theorem follows by Theorem 3.16.
By definition, for all $\mu$, there exists a protocol $\Pi_\mu$ for $f$ s.t. $\mathrm{QIC}(\Pi_\mu, \mu) \leq I + 1$ and error $\leq \eta$ w.r.t. $\mu$. By lemma ESI for sufficiently large $k$, there exists a protocol $\Pi_{k,\mu,\epsilon'}$ s.t. $\mathrm{QCC}(\Pi_{k,\mu,\epsilon'}) \leq k(I + 2)$ and

$$\Pr[\Pi_{k,\mu,\epsilon'} \text{ computes} \geq (1 - 2\eta)k \text{ coordinates of } f^k \text{ correctly}] \geq 1 - e^{-2\eta^2 k} - \epsilon'.$$

Here the probability is over the distribution $\mu^k$ and the randomness of the protocol. Choose $k$ large enough and $\epsilon'$ small enough so that $1 - e^{-2\eta^2 k} - \epsilon' \geq 0.9$. Then by lemma ESI for any integer $c > 0$ and any constant $\epsilon > 0$, there exists a protocol $\tau$ s.t.

$$\Pr[\tau \text{ computes} \geq (1 - 2^{-c/2} - 2c\eta)k \text{ coordinates correctly (on any input } (x_1, \ldots, x_k, y_1, \ldots, y_k))] \geq \frac{1}{2}\left(\left(\frac{0.9}{(1+\epsilon)^k}\right)^c - 2^{-2^{2-2c} k}\right).$$

Here the randomness is only over the randomness of the protocol. Also $\mathrm{QCC}(\tau) \leq c \cdot k \cdot (I + 2) + o(k)$.
Choose c = [2 log (rr)l • Also choose 77 = Then 

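$$1 - 2^{-c/2} - 2c\eta \geq 1 - \epsilon_{sh}.$$

Since $2^{2x} \geq 1 + x$ for all $x \geq 0$, it follows that

$$\left(\frac{0.9}{(1+\epsilon)^k}\right)^c \geq 0.9^c \cdot 2^{-2\epsilon c k} \geq 2^{-(2\epsilon k + 1) \cdot c} \geq 2^{-4 \cdot \epsilon \cdot c \cdot k}.$$

The last inequality is true for sufficiently large $k$. Now choose $\epsilon = \epsilon_{sh}/100c$. Then since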

2- 22_2cfc < 2-4 fc / 16 

we get that 

1 ( ( °' 9 .y _ 2- 22_2cfc ^ > I (V 4*/25 - 2-4A/ lf -^ 

> 2 “ 4 fc / 16 

2 e sh^ 


2 \\{l + e) 


The second inequality holds for sufficiently large $k$. Hence $\mathrm{QCC}(\tau) \leq c \cdot k \cdot (I + 2) + o(k)$ and

$$\Pr[\tau \text{ computes} \geq (1 - \epsilon_{sh})k \text{ coordinates correctly (on any input } (x_1, \ldots, x_k, y_1, \ldots, y_k))] \geq 2^{-\epsilon_{sh} k},$$

which implies that $\mathrm{QCC}(f^k, (1 - \epsilon_{sh})k, 1 - 2^{-\epsilon_{sh} k}) \leq O(k \cdot (I + 2)) + o(k)$. □
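The chain of lower bounds on $\left(0.9/(1+\epsilon)^k\right)^c$ used in this proof can be sanity-checked in the log domain. The sketch below is only a numerical illustration; the function name and the sample values of `eps`, `c` and `k` are hypothetical, and the last step needs $2\epsilon k \geq 1$, which is what "sufficiently large $k$" provides.

```python
import math

def log2_success_chain(eps, c, k):
    """Base-2 logarithms of the four quantities in the chain
    (0.9/(1+eps)^k)^c >= 0.9^c * 2^(-2*eps*c*k) >= 2^(-(2*eps*k+1)*c) >= 2^(-4*eps*c*k).
    """
    exact = c * (math.log2(0.9) - k * math.log2(1 + eps))
    step1 = c * math.log2(0.9) - 2 * eps * c * k   # uses 1 + eps <= 2^(2*eps)
    step2 = -(2 * eps * k + 1) * c                 # uses 0.9 >= 1/2
    step3 = -4 * eps * c * k                       # uses 2*eps*k >= 1
    return exact, step1, step2, step3
```

Working with logarithms avoids floating-point underflow, since the quantities themselves are exponentially small in $k$.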

Corollary 5.8. For all boolean functions $f$, $\mathrm{QCC}(f, 1/3) \leq 2^{O(\mathrm{QIC}(f, 1/3) + 1)}$.

Proof. We will use the following folklore result:

$$R(f, 1/3) \leq \left(\frac{1}{\mathrm{disc}(f)}\right)^{O(1)},$$

where $R(f, 1/3)$ is the (public-coin) randomized communication complexity of $f$ with error $1/3$ and $\mathrm{disc}(f) = \min_\mu \mathrm{disc}_\mu(f)$. See, for example, exercise 3.32 in [KN97]. This implies

$$\mathrm{QCC}(f, 1/3) \leq R(f, 1/3) \leq \left(\frac{1}{\mathrm{disc}(f)}\right)^{O(1)} \leq 2^{O(\mathrm{GDM}_{1/5}(f))}. \quad (29)$$

Now, by Theorem 5.7 and Theorem 4.13 we get that $\mathrm{QIC}(f, \eta) \geq \Omega(\mathrm{GDM}_{1/5}(f) - O(1))$ for some small constant $\eta$. By Lemma 3.15 we also get that $\mathrm{QIC}(f, 1/3) \geq \Omega(\mathrm{GDM}_{1/5}(f) - O(1))$, which combined with equation (29) completes the proof. □


6 From AND to Disj 

In this section, we show that a protocol with low quantum information cost for AND implies a protocol with low quantum information cost for Disjointness.

Lemma 6.1.

$$\max_\nu \mathrm{QIC}(\mathrm{DISJ}_n, \nu, 2/n) \leq n \cdot \mathrm{QIC}_0^r(\mathrm{AND}, 1/n^2) + O(r \cdot \log^5(n)) + o(\sqrt{n}) \quad (30)$$

Proof. Let $\mathrm{QIC}_0^r(\mathrm{AND}, 1/n^2) = I$. Suppose $\pi$ is a protocol for AND which has error $\leq 1/n^2$ for all inputs and s.t. $\max_{\mu \text{ s.t. } \mu(1,1) = 0} \mathrm{QIC}(\pi, \mu) \leq I + \delta$, for arbitrarily small $\delta$. Using $\pi$, we will construct a protocol for $\mathrm{DISJ}_n$. The protocol will have low information cost w.r.t. any distribution $\nu$. Suppose $\tau_k$ is a quantum protocol for $\mathrm{DISJ}_k$ that has worst case error $\leq 1/k^{10}$ and communication cost $O(\sqrt{k} \log(k))$. For example, use the protocol from [AA03] and amplify the error to $1/k^{10}$. We will drop the subscript $k$ when it is clear from the context. Consider the protocol $\pi_n$ described as Protocol 4.


Inputs: $(x, y) \in \{0,1\}^n \times \{0,1\}^n$, $(x, y) \sim \nu$
Goal: check if $\mathrm{DISJ}_n(x, y) = 1$ or not.

1. Alice and Bob share a maximally entangled state $|\phi\rangle^{S_A S_B}$ that will serve as shared randomness in order to sample uniformly at random $n/\log^3(n)$ coordinates from $[n]$ (with replacement). Alice has the register $S_A$ and Bob has $S_B$.

2. On the random coordinates, run $\tau$. Suppose $O_A$ is the output register for Alice and $O_B$ is the output register for Bob. Note that all this can be implemented using unitaries. Also note that either $O_A = O_B = 1$ or $O_A = O_B = 0$.

3. If $O_A = O_B = 1$, then run $\pi$ on each coordinate. If $\pi$ outputs 1 on any coordinate, then output 0, otherwise output 1. If $O_A = O_B = 0$, Alice and Bob will keep running a dummy protocol (for example, keep exchanging a freshly prepared register $|0\rangle$ of the same dimension as the register to be sent in $\pi^n$ in the corresponding step). In the end they output 0.

Protocol 4: Subsampling Protocol $\pi_n$

We will denote the protocol in which $\pi$ is run independently on each coordinate by $\pi^n$. First let us analyze the error of the protocol $\pi_n$. Suppose $(x, y)$ are disjoint. Then the probability that we output 0 because of $\tau$ is at most $\log^{30}(n)/n^{10} \leq 1/n$, and the probability that we output 0 because of $\pi^n$ is at most $n/n^2 = 1/n$ by the union bound. So the error in this case is $\leq 2/n$. If the sets are intersecting, even if we do not output 0 because of $\tau$, we will output 0 because of $\pi^n$ w.p. at least $1 - 1/n^2$ (because on the intersecting coordinate, $1/n^2$ is the probability of failure). So in both cases, the probability of error is $\leq 2/n$.
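The bound $\log^{30}(n)/n^{10} \leq 1/n$ in the disjoint case only holds once $n$ is large enough, which the following Python sketch makes concrete. It assumes base-2 logarithms (the base is not specified in the text) and uses a hypothetical helper name; the terms are computed through logarithms to avoid overflow.

```python
import math

def disjoint_case_error_terms(n):
    """Return (tau_term, pi_term) for disjoint inputs of size n.

    tau_term = log^30(n) / n^10 bounds the error of the amplified DISJ
    protocol on n / log^3(n) coordinates; pi_term = n * (1 / n^2) is the
    union bound over the n runs of the AND protocol.
    """
    log_n = math.log2(n)
    tau_term = 2.0 ** (30 * math.log2(log_n) - 10 * log_n)
    pi_term = 1.0 / n
    return tau_term, pi_term
```

For example, at $n = 2^{10}$ the first term is still larger than $1/n$, while at $n = 2^{20}$ it is far below it, so the $2/n$ error bound should be read as a statement about sufficiently large $n$.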

Now let us figure out the information cost of $\pi_n$. For running $\tau$, we just bound the information cost by the communication cost, which is at most $\sqrt{n}/\sqrt{\log(n)} = o(\sqrt{n})$. The interesting part is what happens after $\tau$. Let us look at the state of Alice and Bob after $\tau$ is over. Alice holds the registers $A_\tau, O_A, S_A$, where $A_\tau$ is what is left behind with Alice after $\tau$, $O_A$ is Alice's output register for $\tau$ and $S_A$ is the entanglement register which acts as shared randomness. Similarly Bob holds $B_\tau, O_B, S_B$. After running $i$ steps of $\pi^n$ (just before the $(i+1)$-th message is transmitted), Alice and Bob hold registers $A_{i+1}$ and $B_{i+1}$ respectively, with $C_{i+1}$ (the register to be sent next) with Alice if $i$ is even and with Bob if $i$ is odd. Note that the number of rounds of $\pi$ is $r$. Then the information cost of step 3 is:


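$$
\begin{aligned}
&\frac{1}{2} \sum_{i=0,\, i \text{ even}}^{r-1} I(C_{i+1}; R \mid B_{i+1}, B_\tau, O_B, S_B) + \frac{1}{2} \sum_{i=0,\, i \text{ odd}}^{r-1} I(C_{i+1}; R \mid A_{i+1}, A_\tau, O_A, S_A) \\
&\leq \frac{1}{2} \sum_{i=0,\, i \text{ even}}^{r-1} I(C_{i+1}; R, B_\tau, O_B, S_B \mid B_{i+1}) + \frac{1}{2} \sum_{i=0,\, i \text{ odd}}^{r-1} I(C_{i+1}; R, A_\tau, O_A, S_A \mid A_{i+1}) \\
&\leq \frac{1}{2} \sum_{i=0,\, i \text{ even}}^{r-1} I(C_{i+1}; R, B_\tau, O_B, S_B, A_\tau, O_A, S_A \mid B_{i+1}) + \frac{1}{2} \sum_{i=0,\, i \text{ odd}}^{r-1} I(C_{i+1}; R, B_\tau, O_B, S_B, A_\tau, O_A, S_A \mid A_{i+1}) \\
&= \frac{1}{2} \sum_{i=0,\, i \text{ even}}^{r-1} I(C_{i+1}; O_A \mid B_{i+1}) + \frac{1}{2} \sum_{i=0,\, i \text{ even}}^{r-1} I(C_{i+1}; R, B_\tau, S_B, A_\tau, S_A \mid B_{i+1}, O_A) \\
&\quad + \frac{1}{2} \sum_{i=0,\, i \text{ even}}^{r-1} I(C_{i+1}; O_B \mid B_{i+1}, R, B_\tau, S_B, A_\tau, S_A, O_A) \\
&\quad + \frac{1}{2} \sum_{i=0,\, i \text{ odd}}^{r-1} I(C_{i+1}; O_A \mid A_{i+1}) + \frac{1}{2} \sum_{i=0,\, i \text{ odd}}^{r-1} I(C_{i+1}; R, B_\tau, S_B, A_\tau, S_A \mid A_{i+1}, O_A) \qquad (31) \\
&\quad + \frac{1}{2} \sum_{i=0,\, i \text{ odd}}^{r-1} I(C_{i+1}; O_B \mid A_{i+1}, R, B_\tau, S_B, A_\tau, S_A, O_A) \\
&\leq \frac{1}{2} \sum_{i=0,\, i \text{ even}}^{r-1} I(C_{i+1}; R, B_\tau, S_B, A_\tau, S_A \mid B_{i+1}, O_A) + \frac{1}{2} \sum_{i=0,\, i \text{ odd}}^{r-1} I(C_{i+1}; R, B_\tau, S_B, A_\tau, S_A \mid A_{i+1}, O_A) + O(r) \\
&= \frac{1}{2} \sum_{i=0,\, i \text{ even}}^{r-1} \Pr[O_A = 1] \cdot I(C_{i+1}; R, B_\tau, S_B, A_\tau, S_A \mid B_{i+1}, O_A = 1) \\
&\quad + \frac{1}{2} \sum_{i=0,\, i \text{ odd}}^{r-1} \Pr[O_A = 1] \cdot I(C_{i+1}; R, B_\tau, S_B, A_\tau, S_A \mid A_{i+1}, O_A = 1) + O(r)
\end{aligned}
$$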

The first two inequalities are by properties of mutual information. The first equality is just the chain rule. The third inequality follows from the fact that $O_A, O_B$ are one dimensional systems. The last equality is true because $O_B$ is just a copy of $O_A$, so after tracing out $O_B$, $O_A$ becomes a classical system; moreover, conditioned on $O_A = 0$, the mutual information expressions are 0 since in that case the $C_{i+1}$ registers are independent of everything else. Now let us analyze the term:


$$\frac{1}{2} \sum_{i=0,\, i \text{ even}}^{r-1} I(C_{i+1}; R, B_\tau, S_B, A_\tau, S_A \mid B_{i+1}, O_A = 1) + \frac{1}{2} \sum_{i=0,\, i \text{ odd}}^{r-1} I(C_{i+1}; R, B_\tau, S_B, A_\tau, S_A \mid A_{i+1}, O_A = 1)$$

We claim that this is equal to $\mathrm{QIC}(\pi^n, \nu')$, where $\nu'$ is the distribution $\nu \mid O_A = 1$. This follows from the following observations:

• Since $O_B$ is just a copy of $O_A$, for all $i$, the state of systems $A_{i+1}, B_{i+1}, C_{i+1}, R, B_\tau, S_B, A_\tau, S_A$ conditioned on $O_A = 1$ (the post-measurement state if $O_A$ is measured and the result is 1) is pure.

• For all $i$, the marginal state of systems $A_{i+1}, B_{i+1}, C_{i+1}$ conditioned on $O_A = 1$ is the same as it would have been if $\pi^n$ was run starting from the distribution $\nu'$. This is because $\pi^n$ never touches the registers $B_\tau, S_B, A_\tau, S_A$.

• If $|\phi\rangle^{R',A,B,C}$ and $|\psi\rangle^{R,A,B,C}$ are two pure states such that $\mathrm{Tr}_{R'} |\phi\rangle\langle\phi|^{R',A,B,C} = \mathrm{Tr}_{R} |\psi\rangle\langle\psi|^{R,A,B,C}$, then $I(C; R' \mid B)_\phi = I(C; R \mid B)_\psi$.

Remark 6.2. The reader might have noticed that the trick of merging stuff with the purification 
register and then applying the last observation is used at a lot of places in this paper. This seems 
to be a very useful trick and seems to replace the classical Proposition 2.9 from [Bra m- 
Putting it all together, we have the following upper bound on information cost of step 3: 


$$
\begin{aligned}
\Pr[O_A = 1] \cdot \mathrm{QIC}(\pi^n, \nu') + O(r) &\leq \Pr[O_A = 1] \cdot \left(\sum_{i=1}^{n} \mathrm{QIC}(\pi, \nu'_i)\right) + O(r) \\
&\leq \Pr[O_A = 1] \cdot n \cdot \mathrm{QIC}\left(\pi, \sum_{i=1}^{n} \nu'_i / n\right) + O(r) \\
&\leq \Pr[O_A = 1] \cdot n \cdot (I + \delta) + O(\Pr[O_A = 1] \cdot n \cdot r H(w)) + O(r) \qquad (32)
\end{aligned}
$$

Here $\nu'_i$ is the marginal distribution on the $i$-th coordinate and $w = \sum_{i=1}^{n} \nu'_i(1,1)/n$. The first inequality is by Lemma 4.14. The second inequality is just concavity of information cost, Lemma 3.13. The last inequality follows from Corollary 4.9. Now we can assume that $\Pr[O_A = 1] \geq 1/n$, otherwise (32) is trivially bounded by $O(r)$. Now let us bound $w$. Suppose $(X, Y)$ are random variables s.t. $(X, Y) \sim \nu$. Also let $N(x, y)$ be the number of intersections in $x$ and $y$, i.e. the number of $i$ such that $x_i = y_i = 1$. Then

$$
\begin{aligned}
\Pr[N(X,Y) = d \mid O_A = 1] &= \frac{\Pr[N(X,Y) = d] \cdot \Pr[O_A = 1 \mid N(X,Y) = d]}{\Pr[O_A = 1]} \\
&\leq \Pr[N(X,Y) = d] \cdot \Pr[O_A = 1 \mid N(X,Y) = d] \cdot n \\
&\leq \Pr[N(X,Y) = d] \cdot \left(\left(1 - \frac{d}{n}\right)^{n/\log^3(n)} + \frac{\log^{30}(n)}{n^{10}}\right) \cdot n \\
&\leq e^{-d/\log^3(n)} \cdot n + \frac{\log^{30}(n)}{n^{9}}.
\end{aligned}
$$

The second inequality follows because if there are $d$ intersections, then the probability of getting no intersection in $n/\log^3(n)$ uniformly random coordinates is at most the first term. The second term is due to the error of the amplified protocol for disjointness. So for $d \geq 9 \ln(2) \log^4(n)$, $\Pr[N(X,Y) = d \mid O_A = 1] \leq 1/n^8$. Thus


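$$w = \sum_{i=1}^{n} \nu'_i(1,1)/n = \mathbb{E}_{(X,Y) \sim \nu'}\, N(X,Y)/n \leq O(\log^4(n)/n).$$

Thus we can bound (32) as follows:

$$
\begin{aligned}
&\Pr[O_A = 1] \cdot n \cdot (I + \delta) + O(\Pr[O_A = 1] \cdot n \cdot r H(w)) + O(r) \\
&\leq n \cdot (I + \delta) + O(n \cdot r H(w)) + O(r) \\
&\leq n \cdot (I + \delta) + O(r \log^5(n)).
\end{aligned}
$$

Since $\delta$ was arbitrarily small, this completes the proof.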

□ 
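Two numerical facts used at the end of this proof are easy to check: at the threshold $d = 9\ln(2)\log^4(n)$ the leading term $e^{-d/\log^3(n)} \cdot n$ equals $n^{-8}$, and for $w = \log^4(n)/n$ the quantity $n \cdot H(w)$ is of order $\log^5(n)$. The Python sketch below assumes base-2 logarithms and uses hypothetical helper names; the value $n = 2^{30}$ is just a sample.

```python
import math

def binary_entropy(w):
    """Binary entropy H(w) in bits, with H(0) = H(1) = 0."""
    if w <= 0.0 or w >= 1.0:
        return 0.0
    return -w * math.log2(w) - (1 - w) * math.log2(1 - w)

def threshold_d(n):
    """The threshold 9 ln(2) log^4(n) on the number of intersections."""
    return 9 * math.log(2) * math.log2(n) ** 4

def leading_tail_term(n, d):
    """The term e^{-d / log^3(n)} * n from the bound on Pr[N = d | O_A = 1]."""
    return math.exp(-d / math.log2(n) ** 3) * n
```

The first check explains where the constant $9\ln(2)$ comes from: it turns the exponential into $2^{-9\log n} = n^{-9}$, and the extra factor $n$ leaves $n^{-8}$.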


7 Proof of the main result 

We now put everything together to get a lower bound on $\mathrm{QIC}_0^r(\mathrm{AND}, 1/3)$.

Lemma 7.1. For all $r$, it holds that

$$\mathrm{QIC}_0^r(\mathrm{AND}, 1/3) \geq \Omega\left(\frac{1}{r \cdot \log^8 r}\right).$$

Proof. We know by Theorem 3.17 that $\mathrm{GDM}_{1/5}(\mathrm{DISJ}_n) \geq \Omega(\sqrt{n})$. Hence, by Theorem 5.7 we must have that $\max_\mu \mathrm{QIC}(\mathrm{DISJ}_n, \mu, 2/n) \geq \Omega(\sqrt{n})$. Putting this together with Lemma 6.1 and

Corollary 14.171 and let r = 0 ( lo /? n 'j , we have, 

QIC r 0 (AND, 1/3) = n ( 1 2 ) = n (^s-) . 

V v n ' n / V r • log r J 

□ 
Corollary 7.2. Let $\mu^*$ be the distribution such that $\mu^*(0,0) = 1/3$, $\mu^*(0,1) = 1/3$, $\mu^*(1,0) = 1/3$. Then

$$\inf_{\Pi \in \mathcal{T}^r(\mathrm{AND}, 1/3)} \mathrm{QIC}(\Pi, \mu^*) = \Omega\left(\frac{1}{r \cdot \log^8 r}\right).$$

Proof. For any distribution $\mu_0$ such that $\mu_0(1,1) = 0$, it is easy to see that $\mu^*$ can be written as $\mu^* = \frac{1}{3}\mu_0 + \frac{2}{3}\mu'$, where $\mu'$ is some other valid distribution. By Lemma 3.13 we have

$$\mathrm{QIC}(\Pi, \mu^*) \geq \frac{1}{3}\mathrm{QIC}(\Pi, \mu_0) + \frac{2}{3}\mathrm{QIC}(\Pi, \mu') \geq \frac{1}{3}\mathrm{QIC}(\Pi, \mu_0).$$

Then we have

$$\mathrm{QIC}(\Pi, \mu^*) \geq \frac{1}{3} \max_{\mu_0} \mathrm{QIC}(\Pi, \mu_0).$$

Therefore by Lemma 7.1 we have

$$\inf_{\Pi \in \mathcal{T}^r(\mathrm{AND}, 1/3)} \mathrm{QIC}(\Pi, \mu^*) \geq \frac{1}{3}\mathrm{QIC}_0^r(\mathrm{AND}, 1/3) = \Omega\left(\frac{1}{r \cdot \log^8 r}\right).$$

□ 
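The decomposition $\mu^* = \frac{1}{3}\mu_0 + \frac{2}{3}\mu'$ can be made explicit. The Python sketch below solves for $\mu'$ with exact rational arithmetic and is only an illustration; the helper name and the dictionary representation of distributions are assumptions.

```python
from fractions import Fraction

THIRD = Fraction(1, 3)
MU_STAR = {(0, 0): THIRD, (0, 1): THIRD, (1, 0): THIRD, (1, 1): Fraction(0)}

def complement_distribution(mu0):
    """Given mu0 with mu0(1,1) = 0, return mu' with mu* = (1/3) mu0 + (2/3) mu'."""
    if mu0.get((1, 1), 0) != 0:
        raise ValueError("mu0 must put no mass on (1, 1)")
    return {
        xy: (MU_STAR[xy] - THIRD * Fraction(mu0.get(xy, 0))) / Fraction(2, 3)
        for xy in MU_STAR
    }
```

Since every entry of $\mu_0$ is at most 1 and $\mu^*$ is $1/3$ on the support of $\mu_0$, all entries of $\mu'$ come out non-negative, even when $\mu_0$ is a point mass.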

Theorem 7.3. For all $r, n \in \mathbb{N}$, $\mathrm{QCC}^r(\mathrm{DISJ}_n, 1/3) = \Omega\left(\frac{n}{r \log^8 r}\right)$.

Proof. Combining Lemma 4.20 and Lemma 7.1, we get this theorem. □


8 Low information protocol for AND 

In this section, we exhibit an $\tilde{O}(1/r)$ information $4r$-round protocol for AND (w.r.t. the prior $1/3, 1/3, 1/3, 0$) which computes correctly on all inputs with probability 1. The protocol is due to Jain, Radhakrishnan and Sen. Consider the protocol described in Protocol 5.

First let us see why it computes AND. Let $|\psi_i^{x,y}\rangle = \cos(\phi_i^{x,y})|0\rangle + \sin(\phi_i^{x,y})|1\rangle$ be the state of qubit $C$ after $i$ rounds when the input is $(x, y)$. If the input is $0,0$, $\phi_i^{0,0}$ is always 0. Also when the input is $0,1$, $\phi_i^{0,1}$ is always 0. So $|\psi_{4r-1}^{0,0}\rangle = |\psi_{4r-1}^{0,1}\rangle = |0\rangle$ always. When the input is $1,0$, $\phi_i^{1,0}$ follows the trajectory $2\theta \to 2\theta \to 0 \to 0 \to 2\theta \to \cdots$. So $|\psi_{4r-1}^{1,0}\rangle = |0\rangle$ as well. When the input is $1,1$, $\phi_i^{1,1}$ follows the trajectory $2\theta \to -2\theta \to 4\theta \to -4\theta \to \cdots \to -\pi/2$. So $|\psi_{4r-1}^{1,1}\rangle = -|1\rangle$. Thus the players compute AND correctly.
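The correctness argument can be checked by simulating the single qubit $C$ of Protocol 5 directly. The Python sketch below assumes $\theta = \pi/(8r)$, so that the angle reached after $4r - 1$ rounds on input $(1,1)$ has $\sin^2 = 1$; the function name is hypothetical, and only Bob's measurement statistics are reported (the sign of the final amplitude does not affect them).

```python
import math

def and_protocol_output_probability(x, y, r):
    """Probability that Bob measures 1 after 4r - 1 rounds on input (x, y).

    The qubit is stored as real amplitudes (amp0, amp1). Alice acts in odd
    rounds (reflection about |v> if x = 1), Bob in even rounds (Z if y = 1).
    """
    theta = math.pi / (8 * r)          # assumed rotation angle
    c2, s2 = math.cos(2 * theta), math.sin(2 * theta)
    amp0, amp1 = 1.0, 0.0              # Alice prepares |0>
    for rnd in range(1, 4 * r):
        if rnd % 2 == 1:
            if x == 1:                 # U_v: reflect about cos(theta)|0> + sin(theta)|1>
                amp0, amp1 = c2 * amp0 + s2 * amp1, s2 * amp0 - c2 * amp1
        else:
            if y == 1:                 # Z: reflect about |0>
                amp1 = -amp1
    return amp1 ** 2
```

On input $(1,0)$ Alice's consecutive reflections cancel in pairs, whereas on input $(1,1)$ Bob's $Z$ between them turns each pair into a rotation by $4\theta$, which is why only that input reaches $|1\rangle$.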

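Now let us analyze the information cost of this protocol. Note that after $i$ rounds the full state can be written as follows:

$$|\psi_i\rangle^{XYCR} = \sum_{x, y \text{ s.t. } x \wedge y = 0} \frac{1}{\sqrt{3}}\, |x\rangle^X |y\rangle^Y |\psi_i^{x,y}\rangle^C |x, y\rangle^R.$$

Then the information cost is given by:

$$\frac{1}{2} \sum_{i=1,\, \text{odd}}^{4r-1} I(C; R \mid Y)_{\psi_i} + \frac{1}{2} \sum_{i=1,\, \text{even}}^{4r-1} I(C; R \mid X)_{\psi_i}.$$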
Inputs: $(x, y) \in \{0,1\} \times \{0,1\}$

Goal: compute $\mathrm{AND}(x, y)$

1. Set $\theta = \frac{\pi}{8r}$. Let $|v\rangle$ be the vector $\cos(\theta)|0\rangle + \sin(\theta)|1\rangle$. Let $U_v$ be the unitary operation of reflecting about the vector $|v\rangle$, i.e. $U_v|0\rangle = \cos(2\theta)|0\rangle + \sin(2\theta)|1\rangle$ and $U_v|1\rangle = \sin(2\theta)|0\rangle - \cos(2\theta)|1\rangle$. Also let $Z$ be the unitary operation of reflecting about $|0\rangle$, i.e. $Z|0\rangle = |0\rangle$ and $Z|1\rangle = -|1\rangle$.

2. Alice starts by preparing a qubit $C$ in state $|0\rangle$.

3. If $x = 0$, Alice applies the identity operation on $C$ and sends it to Bob. If $x = 1$, Alice applies the $U_v$ operation on $C$ and sends it to Bob.

4. If $y = 0$, Bob applies the identity operation on $C$ and sends it to Alice. If $y = 1$, Bob applies the $Z$ operation on $C$ and sends it to Alice.

5. After $4r - 1$ rounds, Bob measures the register $C$. If the result is 1, then he answers 1, otherwise 0. He also sends this to Alice.

Protocol 5: Protocol for AND


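Let us look at a particular term:

$$
\begin{aligned}
I(C; R \mid Y)_{\psi_i} &= H(C, Y)_{\psi_i} + H(R, Y)_{\psi_i} - H(C, R, Y)_{\psi_i} - H(Y)_{\psi_i} \\
&= H(C, Y)_{\psi_i} + H(C, X)_{\psi_i} - H(X)_{\psi_i} - H(Y)_{\psi_i} \\
&= H(C \mid Y)_{\psi_i} + H(C \mid X)_{\psi_i} \\
&= \frac{2}{3} H(C \mid Y = 0)_{\psi_i} + \frac{1}{3} H(C \mid Y = 1)_{\psi_i} + \frac{2}{3} H(C \mid X = 0)_{\psi_i} + \frac{1}{3} H(C \mid X = 1)_{\psi_i} \\
&= \frac{2}{3} H(C \mid Y = 0)_{\psi_i}.
\end{aligned}
$$

The first equality is by definition. For the second equality, we use the fact that for a pure state on some systems $A, B$, $H(A) = H(B)$. The third equality is again by definition. For the fourth equality, we use the fact that if we trace out $R$, then $X$ and $Y$ become classical. For the fifth equality, we use the fact that conditioned on $Y = 1$, system $C$ is in a pure state, namely $|\psi_i^{0,1}\rangle$. Similarly, conditioned on $X = 1$, it is in state $|\psi_i^{1,0}\rangle$. Conditioned on $X = 0$, $C$ is in the state $|0\rangle$. Now conditioned on $Y = 0$, $C$ is in the state:

$$\frac{1}{2} |\psi_i^{0,0}\rangle\langle\psi_i^{0,0}| + \frac{1}{2} |\psi_i^{1,0}\rangle\langle\psi_i^{1,0}|.$$

This is $|0\rangle$ if $i \equiv 3 \pmod 4$, and if $i \equiv 1 \pmod 4$, the density matrix is given by:

$$\rho = \begin{pmatrix} \frac{1}{2} + \frac{1}{2}\cos^2(2\theta) & \frac{1}{2}\cos(2\theta)\sin(2\theta) \\ \frac{1}{2}\cos(2\theta)\sin(2\theta) & \frac{1}{2}\sin^2(2\theta) \end{pmatrix}.$$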

Eigenvalue computation shows that $H(\rho) = H(\sin^2(\theta)) = O(\theta^2 \log(1/\theta)) = O(\log(r)/r^2)$. So some of Alice's terms are 0 and some are $O(\log(r)/r^2)$. Similarly some of Bob's terms are 0 and some are $O(\log(r)/r^2)$. So in total we get that the information cost is $O(\log(r)/r)$. Note that from the protocol it might seem that, since the roles of Alice and Bob are asymmetric, only Alice is sending information and Bob is not. However, this definition of quantum information cost also accounts for sending back information in some sense. For example, in some of the rounds, Alice sends Bob some information but Bob sends it back, so that is accounted for. This results in Bob's part of the cost being non-zero and in fact equal to that of Alice.
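The eigenvalue computation behind $H(\rho) = H(\sin^2(\theta))$ can be verified numerically for the $2 \times 2$ density matrix $\rho = \frac{1}{2}|0\rangle\langle 0| + \frac{1}{2}|\psi\rangle\langle\psi|$ with $|\psi\rangle$ at angle $2\theta$. The Python sketch below uses the closed form for the eigenvalues of a real symmetric $2 \times 2$ matrix; the helper names and the choice $\theta = \pi/(8r)$ in the checks are assumptions.

```python
import math

def binary_entropy(p):
    """Binary entropy in bits, with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def smaller_eigenvalue(a, b, d):
    """Smaller eigenvalue of the real symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b * b)
    return mean - radius

def entropy_conditioned_on_y0(theta):
    """Entropy of C given Y = 0 in a round with i = 1 (mod 4)."""
    c, s = math.cos(2 * theta), math.sin(2 * theta)
    lam = smaller_eigenvalue(0.5 + 0.5 * c * c, 0.5 * c * s, 0.5 * s * s)
    return binary_entropy(max(lam, 0.0))
```

The two eigenvalues are $\cos^2(\theta)$ and $\sin^2(\theta)$, so the entropy per such round scales like $\theta^2 \log(1/\theta)$ for small $\theta$.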

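Now let us see what happens if we place a small mass $w$ on the $(1,1)$ entry. Then the full state can be described as follows:

$$|\psi_i\rangle^{XYCR} = \sum_{x, y \text{ s.t. } x \wedge y = 0} \sqrt{\frac{1-w}{3}}\, |x\rangle^X |y\rangle^Y |\psi_i^{x,y}\rangle^C |x,y\rangle^R + \sqrt{w}\, |1\rangle^X |1\rangle^Y |\psi_i^{1,1}\rangle^C |1,1\rangle^R.$$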

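The $i$-th term of the information cost, as before, is given by:

$$
\begin{aligned}
&\frac{2(1-w)}{3} H(C \mid Y = 0)_{\psi_i} + \frac{1+2w}{3} H(C \mid Y = 1)_{\psi_i} + \frac{2(1-w)}{3} H(C \mid X = 0)_{\psi_i} + \frac{1+2w}{3} H(C \mid X = 1)_{\psi_i} \\
&= \frac{2(1-w)}{3} H(C \mid Y = 0)_{\psi_i} + \frac{1+2w}{3} H(C \mid Y = 1)_{\psi_i} + \frac{1+2w}{3} H(C \mid X = 1)_{\psi_i}.
\end{aligned}
$$

As before, $H(C \mid X = 0)_{\psi_i} = 0$. But the other three terms are non-zero. $H(C \mid Y = 0)_{\psi_i}$ is the same as before. Let us focus on $H(C \mid Y = 1)_{\psi_i}$. The state of $C$ conditioned on $Y = 1$ is given by:

$$\frac{1-w}{1+2w}\, |\psi_i^{0,1}\rangle\langle\psi_i^{0,1}| + \frac{3w}{1+2w}\, |\psi_i^{1,1}\rangle\langle\psi_i^{1,1}|.$$

For $i$ odd, the density matrix is given by:

$$\rho = \begin{pmatrix} \frac{1-w}{1+2w} + \frac{3w}{1+2w}\cos^2((i+1)\theta) & \frac{3w}{1+2w}\cos((i+1)\theta)\sin((i+1)\theta) \\ \frac{3w}{1+2w}\cos((i+1)\theta)\sin((i+1)\theta) & \frac{3w}{1+2w}\sin^2((i+1)\theta) \end{pmatrix}.$$

Eigenvalue computation shows that

$$H(\rho) = H\left(\frac{1 - \sqrt{1 - \frac{12w(1-w)\sin^2((i+1)\theta)}{(1+2w)^2}}}{2}\right).$$

Now assuming $w \leq 1/6$ and considering $i$ such that $\sin^2((i+1)\theta) \geq 4/5$, we get that

$$\frac{1 - \sqrt{1 - \frac{12w(1-w)\sin^2((i+1)\theta)}{(1+2w)^2}}}{2} \geq \frac{1 - \sqrt{1 - \frac{8w}{(1+2w)^2}}}{2} = \frac{1 - \frac{1-2w}{1+2w}}{2} = \frac{2w}{1+2w}.$$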

Since the other terms involving $w$ either have a positive contribution or are of lower order, we get that for a constant fraction of the rounds, the information cost term increases by an additive $\Omega(H(w))$. Hence the overall increase in information cost is at least $\Omega(r H(w))$.
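The lower bound $2w/(1+2w)$ on the smaller eigenvalue can be checked on a grid of parameters. The Python sketch below evaluates the closed-form eigenvalue from the text for $w \leq 1/6$ and $\sin^2((i+1)\theta) \geq 4/5$; the function name and the grid are illustrative assumptions, and a small tolerance is used because the bound is tight at $w = 1/6$, $\sin^2 = 4/5$.

```python
import math

def smaller_eigenvalue(w, sin_sq):
    """Smaller eigenvalue of the state of C conditioned on Y = 1 (i odd),
    where sin_sq stands for sin^2((i + 1) * theta)."""
    inner = 1 - 12 * w * (1 - w) * sin_sq / (1 + 2 * w) ** 2
    return (1 - math.sqrt(max(inner, 0.0))) / 2

def lower_bound(w):
    """The bound 2w / (1 + 2w) derived in the text."""
    return 2 * w / (1 + 2 * w)
```

Since $2w/(1+2w) \geq 3w/2$ for $w \leq 1/6$, the entropy of this state is at least $H(3w/2)$ in such rounds, which is consistent with an increase of order $H(w)$ per round.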